956 results for Multi-View Rendering


Relevance:

90.00%

Publisher:

Abstract:

This paper addresses the problem of obtaining detailed 3D reconstructions of human faces in real time and with inexpensive hardware. We present an algorithm based on a monocular multi-spectral photometric-stereo setup. Such a system is known to capture highly detailed deforming 3D surfaces at high frame rates without requiring expensive hardware or a synchronized light stage. However, the main challenge of such a setup is the calibration stage, which depends on the light setup and on how the lights interact with the specific material being captured, in this case human faces. For this purpose we develop a self-calibration technique in which the person being captured is asked to perform a rigid motion in front of the camera while maintaining a neutral expression. Rigidity constraints are then used to compute the head's motion with a structure-from-motion algorithm. Once the motion is obtained, a multi-view stereo algorithm reconstructs a coarse 3D model of the face. This coarse model is then used to estimate the lighting parameters with a stratified approach. In the first step, we use a RANSAC search to identify purely diffuse points on the face and simultaneously estimate the diffuse reflectance model. In the second step, we apply non-linear optimization to fit a non-Lambertian reflectance model to the outliers of the previous step. The calibration procedure is validated with synthetic and real data.
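The first calibration step can be illustrated with a minimal sketch. It assumes a single colour channel, known surface normals from the coarse model, and a Lambertian model I = n . l with the albedo folded into the light vector l; shadows are ignored. The function names, the tolerance, and the iteration count are hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np

def fit_light(normals, intensities):
    """Least-squares fit of l in I = n . l (albedo folded into l)."""
    l, *_ = np.linalg.lstsq(normals, intensities, rcond=None)
    return l

def ransac_diffuse(normals, intensities, n_iters=200, tol=0.02, seed=0):
    """Return (light vector, boolean mask of points judged purely diffuse)."""
    rng = np.random.default_rng(seed)
    n = len(intensities)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        # Three points determine the 3-vector l exactly.
        idx = rng.choice(n, size=3, replace=False)
        A = normals[idx]
        if abs(np.linalg.det(A)) < 1e-6:
            continue  # nearly coplanar normals: skip this degenerate sample
        l = np.linalg.solve(A, intensities[idx])
        # Points whose intensity matches the diffuse prediction are inliers.
        inliers = np.abs(normals @ l - intensities) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC found no usable diffuse consensus set")
    # Refit on the whole consensus set for a more stable estimate.
    return fit_light(normals[best_inliers], intensities[best_inliers]), best_inliers
```

The points left outside the returned mask play the role of the outliers to which the second step would fit a non-Lambertian model.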

Relevance:

90.00%

Publisher:

Abstract:

The seminal multiple-view stereo benchmark evaluations from Middlebury and by Strecha et al. have played a major role in propelling the development of multi-view stereopsis (MVS) methodology. The somewhat small size and variability of these data sets, however, limit their scope and the conclusions that can be derived from them. To facilitate further development within MVS, we here present a new and varied data set consisting of 80 scenes, seen from 49 or 64 accurate camera positions. This is accompanied by accurate structured light scans for reference and evaluation. In addition all images are taken under seven different lighting conditions. As a benchmark and to validate the use of our data set for obtaining reasonable and statistically significant findings about MVS, we have applied the three state-of-the-art MVS algorithms by Campbell et al., Furukawa et al., and Tola et al. to the data set. To do this we have extended the evaluation protocol from the Middlebury evaluation, necessitated by the more complex geometry of some of our scenes. The data set and accompanying evaluation framework are made freely available online. Based on this evaluation, we are able to observe several characteristics of state-of-the-art MVS, e.g. that there is a tradeoff between the quality of the reconstructed 3D points (accuracy) and how much of an object’s surface is captured (completeness). Also, several issues that we hypothesized would challenge MVS, such as specularities and changing lighting conditions did not pose serious problems. Our study finds that the two most pressing issues for MVS are lack of texture and meshing (forming 3D points into closed triangulated surfaces).
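The accuracy/completeness tradeoff can be made concrete with a small sketch. It assumes both the reconstruction and the reference scan are given as point arrays and uses mean nearest-neighbour distance in each direction; the benchmark's actual protocol (masking, outlier handling, choice of statistic) is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbour in dst (brute force)."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

def accuracy_and_completeness(reconstruction, reference):
    """Lower is better for both values.

    accuracy:     how far reconstructed points lie from the reference surface
    completeness: how far reference points lie from the reconstruction
    """
    acc = nearest_distances(reconstruction, reference).mean()
    comp = nearest_distances(reference, reconstruction).mean()
    return acc, comp
```

A reconstruction that keeps only a few very reliable points scores well on accuracy and poorly on completeness, which is the tradeoff the abstract describes.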

Relevance:

80.00%

Publisher:

Abstract:

Practice-led or multi modal theses (describing examinable outcomes of postgraduate study which comprise the practice of dancing/choreography with an accompanying exegesis) are an emerging strength of dance scholarship; a form of enquiry that has been gaining momentum for over a decade, particularly in Australia and the United Kingdom. It has been strongly argued that, in this form of research, legitimate claims to new knowledge are embodied predominantly within the practice itself (Pakes, 2003) and that these findings are emergent, contingent and often interstitial, contained within both the material form of the practice and in the symbolic languages surrounding the form. In a recent study on ‘dancing’ theses Phillips, Stock, Vincs (2009) found that there was general agreement from academics and artists that ‘there could be more flexibility in matching written language with conceptual thought expressed in practice’. The authors discuss how the seemingly intangible nature of danced / embodied research, reliant on what Melrose (2003) terms ‘performance mastery’ by the ‘expert practitioner’ (2006, Point 4) involving ‘expert’ intuition (2006, Point 5), might be accessed, articulated and validated in terms of alternative ways of knowing through exploring an ongoing dialogue in which the danced practice develops emergent theory. They also propose ways in which the danced thesis can be ‘converted’ into the required ‘durable’ artefact which the ephemerality of live performance denies, drawing on the work of Rye’s ‘multi-view’ digital record (2003) and Stapleton’s ‘multi-voiced audio visual document’(2006, 82). 
Building on a two-year research project (2007-2008) Dancing Between Diversity and Consistency: Refining Assessment in Postgraduate Degrees in Dance, which examined such issues in relation to assessment in an Australian context, the three researchers have further explored issues around interdisciplinarity, cultural differences and documentation through engaging with the following questions:
- How do we represent research in which understandings, meanings and findings are situated within the body of the dancer/choreographer?
- Do these need a form of ‘translating’ into textual form in order to be accessed as research?
- What kind of language structures can be developed to effect this translation: metaphor, allusion, symbol?
- How important is contextualising the creative practice?
- How do we incorporate differing cultural inflections and practices into our reading and evaluation?
- What kind of layered documentation can assist in producing a ‘durable’ research artefact from a non-reproduce-able live event?

Relevance:

80.00%

Publisher:

Abstract:

This paper proposes a new method of using foreground silhouette images for human pose estimation. Labels are introduced to the silhouette images, providing an extra layer of information that can be used in the model fitting process. The pixels in the silhouettes are labelled according to the corresponding body part in the model of the current fit, with the labels propagated into the silhouette of the next frame to be used in the fitting for the next frame. Both single and multi-view implementations are detailed, with results showing performance improvements over only using standard unlabelled silhouettes.

Relevance:

80.00%

Publisher:

Abstract:

In semisupervised learning (SSL), a predictive model is learned from a collection of labeled data and a typically much larger collection of unlabeled data. This paper presents a framework called multi-view point cloud regularization (MVPCR), which unifies and generalizes several semisupervised kernel methods that are based on data-dependent regularization in reproducing kernel Hilbert spaces (RKHSs). Special cases of MVPCR include coregularized least squares (CoRLS), manifold regularization (MR), and graph-based SSL. An accompanying theorem shows how to reduce any MVPCR problem to standard supervised learning with a new multi-view kernel.

Relevance:

80.00%

Publisher:

Abstract:

The practices of robotics and computer vision each involve the application of computational algorithms to data. The research community has developed a very large body of algorithms, but for a newcomer to the field this can be quite daunting. For more than 10 years the author has maintained two open-source MATLAB® Toolboxes, one for robotics and one for vision. They provide implementations of many important algorithms and allow users to work with real problems, not just trivial examples. This new book makes the fundamental algorithms of robotics, vision and control accessible to all. It weaves together theory, algorithms and examples in a narrative that covers robotics and computer vision separately and together. Using the latest versions of the Toolboxes, the author shows how complex problems can be decomposed and solved using just a few simple lines of code. The topics covered are guided by real problems observed by the author over many years as a practitioner of both robotics and computer vision. It is written in a light but informative style, is easy to read and absorb, and includes over 1000 MATLAB® and Simulink® examples and figures. The book is a real walk through the fundamentals of mobile robots, navigation, localization, arm-robot kinematics, dynamics and joint-level control, then camera models, image processing, feature extraction and multi-view geometry, finally bringing it all together with an extensive discussion of visual servo systems.

Relevance:

80.00%

Publisher:

Abstract:

In the multi-view approach to semisupervised learning, we choose one predictor from each of multiple hypothesis classes, and we co-regularize our choices by penalizing disagreement among the predictors on the unlabeled data. We examine the co-regularization method used in the co-regularized least squares (CoRLS) algorithm, in which the views are reproducing kernel Hilbert spaces (RKHSs), and the disagreement penalty is the average squared difference in predictions. The final predictor is the pointwise average of the predictors from each view. We call the set of predictors that can result from this procedure the co-regularized hypothesis class. Our main result is a tight bound on the Rademacher complexity of the co-regularized hypothesis class in terms of the kernel matrices of each RKHS. We find that co-regularization reduces the Rademacher complexity by an amount that depends on the distance between the two views, as measured by a data-dependent metric. We then use standard techniques to bound the gap between training error and test error for the CoRLS algorithm. Experimentally, we find that the amount of reduction in complexity introduced by co-regularization correlates with the amount of improvement that co-regularization gives in the CoRLS algorithm.
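For two views, the procedure described above can be written as an objective of roughly the following form. This is a sketch for orientation and not necessarily the paper's exact notation or weighting: the symbols \ell (number of labeled points), u (number of unlabeled points), the weights \gamma_1, \gamma_2, \lambda, and the placement of the squared loss on the averaged predictor \varphi are assumptions.

```latex
\min_{f^1 \in \mathcal{H}^1,\; f^2 \in \mathcal{H}^2}\;
\frac{1}{\ell} \sum_{i=1}^{\ell} \bigl( y_i - \varphi(x_i) \bigr)^2
+ \gamma_1 \lVert f^1 \rVert_{\mathcal{H}^1}^2
+ \gamma_2 \lVert f^2 \rVert_{\mathcal{H}^2}^2
+ \frac{\lambda}{u} \sum_{j=\ell+1}^{\ell+u} \bigl( f^1(x_j) - f^2(x_j) \bigr)^2,
\qquad
\varphi(x) = \tfrac{1}{2} \bigl( f^1(x) + f^2(x) \bigr)
```

The last sum is the disagreement penalty (average squared difference in predictions on the unlabeled points), and \varphi is the pointwise average used as the final predictor.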

Relevance:

80.00%

Publisher:

Abstract:

Gait energy images (GEIs) and their variants form the basis of many recent appearance-based gait recognition systems. The GEI combines good recognition performance with a simple implementation, though it suffers from problems inherent to appearance-based approaches, such as being highly view dependent. In this paper, we extend the concept of the GEI to 3D, to create what we call the gait energy volume, or GEV. A basic GEV implementation is tested on the CMU MoBo database, showing improvements over both the GEI baseline and a fused multi-view GEI approach. We also demonstrate the efficacy of this approach on partial volume reconstructions created from frontal depth images, which can be more practically acquired, for example, in biometric portals implemented with stereo cameras or other depth acquisition systems. Experiments on frontal depth images are evaluated on an in-house database captured using the Microsoft Kinect, and demonstrate the validity of the proposed approach.
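As a rough illustration, a GEI is commonly computed as the per-pixel average of aligned binary silhouettes over a gait cycle. The sketch below assumes the GEV is the direct analogue, averaging aligned binary voxel volumes; alignment, cycle detection, and the function name are outside the source and are assumptions here.

```python
import numpy as np

def gait_energy(frames):
    """Average aligned binary frames over a gait cycle.

    frames has shape (T, H, W) for silhouettes (giving a GEI) or
    (T, D, H, W) for voxel volumes (giving a GEV, under the assumption above).
    Each output cell is the fraction of frames in which it was occupied.
    """
    frames = np.asarray(frames, dtype=float)
    return frames.mean(axis=0)
```

The same function serves both cases because the averaging runs only over the time axis.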

Relevance:

80.00%

Publisher:

Abstract:

We present an iterative hierarchical algorithm for multi-view stereo. The algorithm attempts to utilise as much contextual information as is available to compute highly accurate and robust depth maps. There are three novel aspects to the approach: 1) we incrementally improve the depth fidelity as the algorithm progresses through the image pyramid; 2) we show how to incorporate visual hull information (when available) to constrain depth searches; and 3) we show how to simultaneously enforce the consistency of the depth map by continual comparison with neighbouring depth maps. We show that this approach produces highly accurate depth maps and, since it is essentially a local method, is both extremely fast and simple to implement.

Relevance:

80.00%

Publisher:

Abstract:

Affect is an important feature of multimedia content and conveys valuable information for multimedia indexing and retrieval. Most existing studies of affective content analysis are limited to low-level features or mid-level representations, and are generally criticized for their inability to address the gap between low-level features and high-level human affective perception. The facial expressions of subjects in images carry important semantic information that can substantially influence human affective perception, but have seldom been investigated for affective classification of facial images in practical applications. This paper presents an automatic image emotion detector (IED) for affective classification of practical (or non-laboratory) data using facial expressions, where many “real-world” challenges are present, including pose, illumination, and size variations. The proposed method is novel, with its framework designed specifically to overcome these challenges using multi-view versions of face and fiducial point detectors, and a combination of point-based texture and geometry. Performance comparisons of several key parameters of relevant algorithms are conducted to explore the optimum parameters for high accuracy and fast computation speed. A comprehensive set of experiments with existing and new datasets shows that the method is effective despite pose variations, fast, appropriate for large-scale data, and as accurate as the method with state-of-the-art performance on laboratory-based data. The proposed method was also applied to affective classification of images from the British Broadcasting Corporation (BBC) in a task typical of a practical application, providing some valuable insights.

Relevance:

80.00%

Publisher:

Abstract:

The core aim of machine learning is to make a computer program learn from experience. Learning from data is usually defined as the task of learning regularities or patterns in data in order to extract useful information, or to learn the underlying concept. An important sub-field of machine learning is multi-view learning, where the task is to learn from multiple data sets or views describing the same underlying concept. A typical example of such a scenario would be to study a biological concept using several biological measurements such as gene expression, protein expression and metabolic profiles, or to classify web pages based on their content and the contents of their hyperlinks. In this thesis, novel problem formulations and methods for multi-view learning are presented. The contributions include a linear data fusion approach for exploratory data analysis, a new measure to evaluate different kinds of representations for textual data, and an extension of multi-view learning to novel scenarios where the correspondence of samples in the different views or data sets is not known in advance. In order to infer the one-to-one correspondence of samples between two views, a novel concept of multi-view matching is proposed. The matching algorithm is completely data-driven and is demonstrated in several applications such as matching of metabolites between humans and mice, and matching of sentences between documents in two languages.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a volumetric formulation for the multi-view stereo problem which is amenable to a computationally tractable global optimisation using Graph-cuts. Our approach is to seek the optimal partitioning of 3D space into two regions labelled as "object" and "empty" under a cost functional consisting of the following two terms: (1) A term that forces the boundary between the two regions to pass through photo-consistent locations and (2) a ballooning term that inflates the "object" region. To take account of the effect of occlusion on the first term we use an occlusion robust photo-consistency metric based on Normalised Cross Correlation, which does not assume any geometric knowledge about the reconstructed object. The globally optimal 3D partitioning can be obtained as the minimum cut solution of a weighted graph.
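The photo-consistency term builds on Normalised Cross Correlation between image patches. The sketch below shows only that basic building block for two equally sized patches; the occlusion-robust aggregation across views that the abstract mentions is not reproduced, and the function name and the zero-return convention for textureless patches are assumptions.

```python
import numpy as np

def ncc(patch_a, patch_b, eps=1e-12):
    """Normalised cross correlation of two equally sized patches, in [-1, 1]."""
    a = np.asarray(patch_a, dtype=float).ravel()
    b = np.asarray(patch_b, dtype=float).ravel()
    # Subtracting the mean and dividing by the norm makes the score
    # invariant to affine brightness changes (gain and offset).
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:
        return 0.0  # textureless patch: correlation is undefined, report 0
    return float(a @ b / denom)
```

A score near 1 indicates that the two patches plausibly view the same surface point, which is what makes a location "photo-consistent".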

Relevance:

80.00%

Publisher:

Abstract:

We present a new co-clustering problem of images and visual features. The problem involves a set of non-object images in addition to a set of object images and features to be co-clustered. Co-clustering is performed in a way that maximises discrimination of object images from non-object images, thus emphasizing discriminative features. This provides a way of obtaining perceptual joint-clusters of object images and features. We tackle the problem by simultaneously boosting multiple strong classifiers which compete for images by their expertise. Each boosting classifier is an aggregation of weak-learners, i.e. simple visual features. The obtained classifiers are useful for object detection tasks which exhibit multimodalities, e.g. multi-category and multi-view object detection tasks. Experiments on a set of pedestrian images and a face data set demonstrate that the method yields intuitive image clusters with associated features and is much superior to conventional boosting classifiers in object detection tasks.

Relevance:

80.00%

Publisher:

Abstract:

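With the teleoperation of underwater robots as its application background, this paper proposes and implements an experimental robot teleoperation system aided by virtual reality technology and visual perception information. The system uses CAD models and stereo vision information to perform geometric and kinematic modelling of the teleoperated robot and its working environment, and achieves generation of the virtual working environment together with real-time dynamic graphical display. It adopts several key techniques: stereo-vision-based consistency correction between the virtual and real environments, overlay of graphics and images, establishment of the pose relationship between the work object and the environment, and network-based monitoring and communication. In this experimental system, the operator can use the generated virtual environment, assisted by multi-viewpoint, multi-window graphical and image displays of the operation state, to observe the task and to carry out robot teleoperation and motion planning dynamically in real time. The work provides experience and key techniques for the realization of advanced teleoperated robot systems.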