Digital Library

40 results for Multi-view geometry

in Queensland University of Technology - ePrints Archive

Practical improvements to simultaneous computation of multi-view geometry and radial lens distortion

Relevance:

100.00%

Publisher:

Abstract:

This paper discusses practical issues related to the use of the division model for lens distortion in multi-view geometry computation. A data normalisation strategy is presented, which has been absent from previous discussions on the topic. The convergence properties of the Rectangular Quadric Eigenvalue Problem solution for computing division model distortion are examined. It is shown that the existing method can require more than 1000 iterations when dealing with severe distortion. A method is presented for accelerating convergence to less than 10 iterations for any amount of distortion. The new method is shown to produce equivalent or better results than the existing method with up to two orders of magnitude reduction in iterations. Through detailed simulation it is found that the number of data points used to compute geometry and lens distortion has a strong influence on convergence speed and solution accuracy. It is recommended that more than the minimal number of data points be used when computing geometry using a robust estimator such as RANSAC. Adding two to four extra samples improves the convergence rate and accuracy sufficiently to compensate for the increased number of samples required by the RANSAC process.
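The abstract does not spell out the formulas involved, so the following is only a minimal sketch under stated assumptions: it uses the standard one-parameter division model, in which an undistorted point is obtained by dividing the distorted coordinates by 1 + λr², and the textbook RANSAC iteration-count formula. The function names, the parameter `lam`, the inlier ratio and the sample sizes are all illustrative and are not taken from the paper.

```python
import math


def undistort_point(x_d, y_d, lam):
    """Map a distorted point to its undistorted position (division model).

    Coordinates are assumed to be relative to the distortion centre.
    """
    r2 = x_d * x_d + y_d * y_d          # squared distorted radius
    scale = 1.0 / (1.0 + lam * r2)      # division-model correction factor
    return x_d * scale, y_d * scale


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Draws needed to pick at least one all-inlier sample with the
    given confidence (requires 0 < inlier_ratio < 1)."""
    p_good_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))


if __name__ == "__main__":
    print(undistort_point(0.5, 0.0, lam=-0.2))
    for s in (8, 9, 10, 12):            # hypothetical sample sizes
        print(s, ransac_iterations(0.5, s))
```

The second function illustrates the trade-off the abstract mentions: each extra data point per sample raises the number of RANSAC draws required, which the paper reports is offset by faster convergence and better accuracy.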

Evaluation of two-view geometry methods with automatic ground-truth generation

Relevance:

100.00%

Publisher:

Abstract:

A large number of methods have been published that aim to evaluate various components of multi-view geometry systems. Most of these have focused on the feature extraction, description and matching stages (the visual front end), since geometry computation can be evaluated through simulation. Many data sets are constrained to small scale scenes or planar scenes that are not challenging to new algorithms, or require special equipment. This paper presents a method for automatically generating geometry ground truth and challenging test cases from high spatio-temporal resolution video. The objective of the system is to enable data collection at any physical scale, in any location and in various parts of the electromagnetic spectrum. The data generation process consists of collecting high resolution video, computing accurate sparse 3D reconstruction, video frame culling and down sampling, and test case selection. The evaluation process consists of applying a test 2-view geometry method to every test case and comparing the results to the ground truth. This system facilitates the evaluation of the whole geometry computation process or any part thereof against data compatible with a realistic application. A collection of example data sets and evaluations is included to demonstrate the range of applications of the proposed system.

3D ellipsoid fitting for multi-view gait recognition

Relevance:

100.00%

Publisher:

Abstract:

Gait recognition approaches continue to struggle with challenges including view-invariance, low-resolution data, robustness to unconstrained environments, and fluctuating gait patterns due to subjects carrying goods or wearing different clothes. Although computationally expensive, model-based techniques offer promise over appearance-based techniques for these challenges as they gather gait features and interpret gait dynamics in skeleton form. In this paper, we propose a fast 3D ellipsoidal-based gait recognition algorithm using a 3D voxel model derived from multi-view silhouette images. This approach directly addresses the limitations of view dependency and self-occlusion in existing model-based approaches that rely on ellipse fitting. Voxel models are segmented into four components (left and right legs, above and below the knee), and ellipsoids are fitted to each region using eigenvalue decomposition. Features derived from the ellipsoid parameters are modeled using a Fourier representation to retain the temporal dynamic pattern for classification. We demonstrate the proposed approach using the CMU MoBo database and show that an improvement of 15-20% can be achieved over a 2D ellipse fitting baseline.

An adaptive spherical view representation for navigation in changing environments

Relevance:

100.00%

Publisher:

Abstract:

Real-world environments such as houses and offices change over time, meaning that a mobile robot’s map will become out of date. In previous work we introduced a method to update the reference views in a topological map so that a mobile robot could continue to localize itself in a changing environment using omni-directional vision. In this work we extend this long-term updating mechanism to incorporate a spherical metric representation of the observed visual features for each node in the topological map. Using multi-view geometry we are then able to estimate the heading of the robot, in order to enable navigation between the nodes of the map, and to simultaneously adapt the spherical view representation in response to environmental changes. The results demonstrate the persistent performance of the proposed system in a long-term experiment.

Long-term experiments with an adaptive spherical view representation for navigation in changing environments

Relevance:

100.00%

Publisher:

Abstract:

Real-world environments such as houses and offices change over time, meaning that a mobile robot’s map will become out of date. In this work, we introduce a method to update the reference views in a hybrid metric-topological map so that a mobile robot can continue to localize itself in a changing environment. The updating mechanism, based on the multi-store model of human memory, incorporates a spherical metric representation of the observed visual features for each node in the map, which enables the robot to estimate its heading and navigate using multi-view geometry, as well as representing the local 3D geometry of the environment. A series of experiments demonstrates the persistent performance of the proposed system in real changing environments, including analysis of the long-term stability.

Robotics, Vision and Control : Fundamental Algorithms in MATLAB

Relevance:

100.00%

Publisher:

Abstract:

The practice of robotics and computer vision each involve the application of computational algorithms to data. The research community has developed a very large body of algorithms but for a newcomer to the field this can be quite daunting. For more than 10 years the author has maintained two open-source MATLAB® Toolboxes, one for robotics and one for vision. They provide implementations of many important algorithms and allow users to work with real problems, not just trivial examples. This new book makes the fundamental algorithms of robotics, vision and control accessible to all. It weaves together theory, algorithms and examples in a narrative that covers robotics and computer vision separately and together. Using the latest versions of the Toolboxes the author shows how complex problems can be decomposed and solved using just a few simple lines of code. The topics covered are guided by real problems observed by the author over many years as a practitioner of both robotics and computer vision. It is written in a light but informative style, it is easy to read and absorb, and includes over 1000 MATLAB® and Simulink® examples and figures. The book is a real walk through the fundamentals of mobile robots, navigation, localization, arm-robot kinematics, dynamics and joint level control, then camera models, image processing, feature extraction and multi-view geometry, and finally bringing it all together with an extensive discussion of visual servo systems.

Towards robust automatic affective classification of images using facial expressions for practical applications

Relevance:

90.00%

Publisher:

Abstract:

Affect is an important feature of multimedia content and conveys valuable information for multimedia indexing and retrieval. Most existing studies for affective content analysis are limited to low-level features or mid-level representations, and are generally criticized for their incapacity to address the gap between low-level features and high-level human affective perception. The facial expressions of subjects in images carry important semantic information that can substantially influence human affective perception, but have been seldom investigated for affective classification of facial images towards practical applications. This paper presents an automatic image emotion detector (IED) for affective classification of practical (or non-laboratory) data using facial expressions, where many “real-world” challenges are present, including pose, illumination, and size variations. The proposed method is novel, with its framework designed specifically to overcome these challenges using multi-view versions of face and fiducial point detectors, and a combination of point-based texture and geometry. Performance comparisons of several key parameters of relevant algorithms are conducted to explore the optimum parameters for high accuracy and fast computation speed. A comprehensive set of experiments with existing and new datasets shows that the method is effective despite pose variations, fast, appropriate for large-scale data, and as accurate as the method with state-of-the-art performance on laboratory-based data. The proposed method was also applied to affective classification of images from the British Broadcasting Corporation (BBC) in a task typical for a practical application, providing some valuable insights.

Dancing the Thesis : Potentials and Pitfalls in Practice-led Research

Relevance:

80.00%

Publisher:

Abstract:

Practice-led or multi modal theses (describing examinable outcomes of postgraduate study which comprise the practice of dancing/choreography with an accompanying exegesis) are an emerging strength of dance scholarship; a form of enquiry that has been gaining momentum for over a decade, particularly in Australia and the United Kingdom. It has been strongly argued that, in this form of research, legitimate claims to new knowledge are embodied predominantly within the practice itself (Pakes, 2003) and that these findings are emergent, contingent and often interstitial, contained within both the material form of the practice and in the symbolic languages surrounding the form. In a recent study on ‘dancing’ theses Phillips, Stock, Vincs (2009) found that there was general agreement from academics and artists that ‘there could be more flexibility in matching written language with conceptual thought expressed in practice’. The authors discuss how the seemingly intangible nature of danced / embodied research, reliant on what Melrose (2003) terms ‘performance mastery’ by the ‘expert practitioner’ (2006, Point 4) involving ‘expert’ intuition (2006, Point 5), might be accessed, articulated and validated in terms of alternative ways of knowing through exploring an ongoing dialogue in which the danced practice develops emergent theory. They also propose ways in which the danced thesis can be ‘converted’ into the required ‘durable’ artefact which the ephemerality of live performance denies, drawing on the work of Rye’s ‘multi-view’ digital record (2003) and Stapleton’s ‘multi-voiced audio visual document’ (2006, 82).
Building on a two-year research project (2007-2008) Dancing Between Diversity and Consistency: Refining Assessment in Postgraduate Degrees in Dance, which examined such issues in relation to assessment in an Australian context, the three researchers have further explored issues around interdisciplinarity, cultural differences and documentation through engaging with the following questions:
• How do we represent research in which understandings, meanings and findings are situated within the body of the dancer/choreographer?
• Do these need a form of ‘translating’ into textual form in order to be accessed as research?
• What kind of language structures can be developed to effect this translation: metaphor, allusion, symbol?
• How important is contextualising the creative practice?
• How do we incorporate differing cultural inflections and practices into our reading and evaluation?
• What kind of layered documentation can assist in producing a ‘durable’ research artefact from a non-reproduce-able live event?

Labelled silhouettes for human pose estimation

Relevance:

80.00%

Publisher:

Abstract:

This paper proposes a new method of using foreground silhouette images for human pose estimation. Labels are introduced to the silhouette images, providing an extra layer of information that can be used in the model fitting process. The pixels in the silhouettes are labelled according to the corresponding body part in the model of the current fit, with the labels propagated into the silhouette of the next frame to be used in the fitting for the next frame. Both single and multi-view implementations are detailed, with results showing performance improvements over only using standard unlabelled silhouettes.

Multiview point cloud kernels for semisupervised learning [Lecture Notes]

Relevance:

80.00%

Publisher:

Abstract:

In semisupervised learning (SSL), a predictive model is learned from a collection of labeled data and a typically much larger collection of unlabeled data. This paper presents a framework called multi-view point cloud regularization (MVPCR), which unifies and generalizes several semisupervised kernel methods that are based on data-dependent regularization in reproducing kernel Hilbert spaces (RKHSs). Special cases of MVPCR include coregularized least squares (CoRLS), manifold regularization (MR), and graph-based SSL. An accompanying theorem shows how to reduce any MVPCR problem to standard supervised learning with a new multi-view kernel.

The Rademacher complexity of coregularized kernel classes

Relevance:

80.00%

Publisher:

Abstract:

In the multi-view approach to semisupervised learning, we choose one predictor from each of multiple hypothesis classes, and we co-regularize our choices by penalizing disagreement among the predictors on the unlabeled data. We examine the co-regularization method used in the co-regularized least squares (CoRLS) algorithm, in which the views are reproducing kernel Hilbert spaces (RKHSs), and the disagreement penalty is the average squared difference in predictions. The final predictor is the pointwise average of the predictors from each view. We call the set of predictors that can result from this procedure the co-regularized hypothesis class. Our main result is a tight bound on the Rademacher complexity of the co-regularized hypothesis class in terms of the kernel matrices of each RKHS. We find that the co-regularization reduces the Rademacher complexity by an amount that depends on the distance between the two views, as measured by a data-dependent metric. We then use standard techniques to bound the gap between training error and test error for the CoRLS algorithm. Experimentally, we find that the amount of reduction in complexity introduced by co-regularization correlates with the amount of improvement that co-regularization gives in the CoRLS algorithm.
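The two quantities named in this abstract, the disagreement penalty (average squared difference in predictions on unlabeled data) and the final predictor (pointwise average of the per-view predictors), can be sketched in a few lines. This is only an illustration for the two-view case: the function names and the toy predictors are hypothetical, and the sketch does not perform the kernel-based fitting that CoRLS itself involves.

```python
def disagreement_penalty(preds_view1, preds_view2):
    """Average squared difference between two views' predictions
    on the same unlabeled points."""
    n = len(preds_view1)
    return sum((a - b) ** 2 for a, b in zip(preds_view1, preds_view2)) / n


def final_predictor(f1, f2):
    """Return the pointwise average of the two per-view predictors."""
    return lambda x: 0.5 * (f1(x) + f2(x))


if __name__ == "__main__":
    # Toy per-view predictors (assumed for illustration only).
    f1 = lambda x: 2.0 * x
    f2 = lambda x: 2.0 * x + 1.0
    unlabeled = [0.0, 1.0, 2.0]
    penalty = disagreement_penalty([f1(x) for x in unlabeled],
                                   [f2(x) for x in unlabeled])
    print(penalty, final_predictor(f1, f2)(1.0))
```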

Gait energy volumes and frontal gait recognition using depth images

Relevance:

80.00%

Publisher:

Abstract:

Gait energy images (GEIs) and their variants form the basis of many recent appearance-based gait recognition systems. The GEI combines good recognition performance with a simple implementation, though it suffers problems inherent to appearance-based approaches, such as being highly view dependent. In this paper, we extend the concept of the GEI to 3D, to create what we call the gait energy volume, or GEV. A basic GEV implementation is tested on the CMU MoBo database, showing improvements over both the GEI baseline and a fused multi-view GEI approach. We also demonstrate the efficacy of this approach on partial volume reconstructions created from frontal depth images, which can be more practically acquired, for example, in biometric portals implemented with stereo cameras, or other depth acquisition systems. Experiments on frontal depth images are evaluated on an in-house developed database captured using the Microsoft Kinect, and demonstrate the validity of the proposed approach.
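The abstract does not define the GEI, so the sketch below assumes the commonly used definition: the pixel-wise mean of aligned binary silhouettes over a gait cycle. The function name and the tiny 2x2 frames are illustrative only. A gait energy volume could presumably be formed the same way by averaging binary voxel grids instead of images, but that is an assumption and not a description of the paper's implementation.

```python
def gait_energy_image(silhouettes):
    """Average a sequence of equally sized binary silhouette frames
    pixel by pixel. Each frame is a list of rows of 0/1 values."""
    n = len(silhouettes)
    rows = len(silhouettes[0])
    cols = len(silhouettes[0][0])
    return [
        [sum(frame[r][c] for frame in silhouettes) / n for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # Two toy silhouette frames (assumed already aligned and size-normalised).
    frames = [
        [[1, 0], [1, 1]],
        [[1, 1], [0, 1]],
    ]
    for row in gait_energy_image(frames):
        print(row)
```

Pixels that are foreground in every frame come out as 1.0, while pixels that change over the cycle take intermediate values, which is what lets a single image summarise the motion.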

A semi-local method for iterative depth-map refinement

Relevance:

80.00%

Publisher:

Abstract:

We present an iterative hierarchical algorithm for multi-view stereo. The algorithm attempts to utilise as much contextual information as is available to compute highly accurate and robust depth maps. There are three novel aspects to the approach: 1) we incrementally improve the depth fidelity as the algorithm progresses through the image pyramid; 2) we show how to incorporate visual hull information (when available) to constrain depth searches; and 3) we show how to simultaneously enforce the consistency of the depth-map by continual comparison with neighbouring depth-maps. We show that this approach produces highly accurate depth-maps and, since it is essentially a local method, is both extremely fast and simple to implement.

Multi-level knowledge transfer in software development outsourcing projects : the agency theory view

Relevance:

40.00%

Publisher:

Abstract:

In recent years, software development outsourcing has become even more complex. Outsourcing partners have begun ‘re-outsourcing’ components of their projects to other outsourcing companies to minimize cost and gain efficiencies, creating a multi-level hierarchy of outsourcing. This research-in-progress paper presents preliminary findings of a study designed to understand knowledge transfer effectiveness of multi-level software development outsourcing projects. We conceptualize the SD-outsourcing entities using the Agency Theory. This study conceptualizes, operationalises and validates the concept of Knowledge Transfer as a three-phase multidimensional formative index of 1) Domain knowledge, 2) Communication behaviors, and 3) Clarity of requirements. Data analysis identified substantial, significant differences between the Principal and the Agent on two of the three constructs. Using Agency Theory, supported by preliminary findings, the paper also provides prescriptive guidelines for reducing the friction between the Principal and the Agent in multi-level software outsourcing.

A multi-layered approach for site detection in UAS emergency landing scenarios using geometry-based image segmentation

Relevance:

40.00%

Publisher:

Abstract:

This paper presents an alternative approach to image segmentation by using the spatial distribution of edge pixels as opposed to pixel intensities. The segmentation is achieved by a multi-layered approach and is intended to find suitable landing areas for an aircraft emergency landing. We combine standard techniques (edge detectors) with newly developed algorithms (line expansion and geometry test) to design an original segmentation algorithm. Our approach removes the dependency on environmental factors that traditionally influence lighting conditions, which in turn have a negative impact on pixel-based segmentation techniques. We present test outcomes on realistic visual data collected from an aircraft, reporting preliminary feedback on the performance of the detection. We demonstrate consistent performance, with detection rates above 97%.
