10 results for Object vision

at Duke University


Relevance:

30.00%

Publisher:

Abstract:

This work explores the use of statistical methods in describing and estimating camera poses, as well as the information feedback loop between camera pose and object detection. Surging development in robotics and computer vision has pushed the need for algorithms that infer, understand, and utilize information about the position and orientation of the sensor platforms when observing and/or interacting with their environment.

The first contribution of this thesis is the development of a set of statistical tools for representing and estimating the uncertainty in object poses. A distribution for representing the joint uncertainty over multiple object positions and orientations is described, called the mirrored normal-Bingham distribution. This distribution generalizes both the normal distribution in Euclidean space, and the Bingham distribution on the unit hypersphere. It is shown to inherit many of the convenient properties of these special cases: it is the maximum-entropy distribution with fixed second moment, and there is a generalized Laplace approximation whose result is the mirrored normal-Bingham distribution. This distribution and approximation method are demonstrated by deriving the analytical approximation to the wrapped-normal distribution. Further, it is shown how these tools can be used to represent the uncertainty in the result of a bundle adjustment problem.

Another application of these methods is illustrated as part of a novel camera pose estimation algorithm based on object detections. The autocalibration task is formulated as a bundle adjustment problem using prior distributions over the 3D points to enforce the objects' structure and their relationship with the scene geometry. This framework is very flexible and enables the use of off-the-shelf computational tools to solve specialized autocalibration problems. Its performance is evaluated using a pedestrian detector to provide head and foot location observations, and it proves much faster and potentially more accurate than existing methods.

Finally, the information feedback loop between object detection and camera pose estimation is closed by utilizing camera pose information to improve object detection in scenarios with significant perspective warping. Methods are presented that allow the inverse perspective mapping traditionally applied to images to be applied instead to features computed from those images. For the special case of HOG-like features, which are used by many modern object detection systems, these methods are shown to provide substantial performance benefits over unadapted detectors while achieving real-time frame rates, orders of magnitude faster than comparable image warping methods.

The statistical tools and algorithms presented here are especially promising for mobile cameras, providing the ability to autocalibrate and adapt to the camera pose in real time. In addition, these methods have wide-ranging potential applications in diverse areas of computer vision, robotics, and imaging.
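
As a rough illustration of the inverse perspective mapping mentioned in this abstract, the sketch below applies a 3x3 homography to 2D coordinates (for example, the centers of feature cells) instead of resampling image pixels. The function name `apply_homography`, the matrix `H`, and the sample points are assumptions made for illustration; the abstract does not describe the thesis's actual feature-warping methods, so this shows only the underlying projective mapping.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of 2D points through a 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1).
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ np.asarray(H, dtype=float).T
    # Divide by the third coordinate to return to the image plane.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical homography with a mild perspective term in y.
H = np.array([[1.0, 0.0,   0.0],
              [0.0, 1.0,   0.0],
              [0.0, 0.001, 1.0]])

cell_centers = np.array([[100.0, 200.0], [50.0, 0.0]])
warped = apply_homography(H, cell_centers)
print(warped)
```

Mapping a few thousand coordinates this way is far cheaper than warping every pixel, which is consistent with the speed advantage the abstract reports for feature-level methods.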

Relevance:

20.00%

Publisher:

Abstract:

Dissertation

Relevance:

20.00%

Publisher:

Abstract:

Recent evidence that echinoids of the genus Echinometra have moderate visual acuity that appears to be mediated by their spines screening off-axis light suggests that the urchin Strongylocentrotus purpuratus, with its higher spine density, may have even more acute spatial vision. We analyzed the movements of 39 specimens of S. purpuratus after they were placed in the center of a featureless tank containing a round, black target that had an angular diameter of 6.5 deg. or 10 deg. (solid angles of 0.01 sr and 0.024 sr, respectively). An average orientation vector for each urchin was determined by testing the animal four times, with the target placed successively at bearings of 0 deg., 90 deg., 180 deg. and 270 deg. (relative to magnetic east). The urchins showed no significant unimodal or axial orientation relative to any non-target feature of the environment or relative to the changing position of the 6.5 deg. target. However, the urchins were strongly axially oriented relative to the changing position of the 10 deg. target (mean axis from -1 to 179 deg.; 95% confidence interval +/- 12 deg.; P<0.001, Moore's non-parametric Hotelling's test), with 10 of the 20 urchins tested against that target choosing an average bearing within 10 deg. of either the target center or its opposite direction (two would be expected by chance). In addition, the average length of the 20 target-normalized bearings for the 10 deg. target (each the vector sum of the bearings for the four trials) was far higher than would be expected by chance (P<10^-10; Monte Carlo simulation), showing that each urchin, whether it moved towards or away from the target, did so with high consistency. These results strongly suggest that S. purpuratus detected the 10 deg. target, responding either by approaching it or fleeing it. Given that the urchins did not appear to respond to the 6.5 deg. target, it is likely that the 10 deg. target was close to the minimum detectable size for this species.
Interestingly, measurements of the spine density of the regions of the test that faced horizontally predicted a similar visual resolution (8.3+/-0.5 deg. for the interambulacrum and 11+/-0.54 deg. for the ambulacrum). The function of this relatively low, but functional, acuity - on par with that of the chambered Nautilus and the horseshoe crab - is unclear but, given the bimodal response, is likely to be related to both shelter seeking and predator avoidance.
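
The solid angles quoted in this abstract can be checked with the standard formula for a circular cap, Ω = 2π(1 − cos(θ/2)), where θ is the angular diameter. The sketch below also shows the common angle-doubling trick for summarizing axial (toward-or-away) bearings. The function names and the sample bearings are hypothetical, and the angle-doubling summary is a generic technique, not the Moore's test or Monte Carlo analysis the authors report.

```python
import math

def cap_solid_angle(angular_diameter_deg):
    """Solid angle (sr) of a circular target with the given angular diameter."""
    half_angle = math.radians(angular_diameter_deg) / 2.0
    return 2.0 * math.pi * (1.0 - math.cos(half_angle))

def axial_mean(bearings_deg):
    """Mean axis (deg, in [0, 180)) and resultant length for axial data.

    Angles are doubled so that opposite directions (b and b + 180)
    reinforce each other instead of cancelling.
    """
    doubled = [math.radians(2.0 * b) for b in bearings_deg]
    c = sum(math.cos(a) for a in doubled) / len(doubled)
    s = sum(math.sin(a) for a in doubled) / len(doubled)
    axis = (math.degrees(math.atan2(s, c)) / 2.0) % 180.0
    return axis, math.hypot(c, s)

print(round(cap_solid_angle(6.5), 3))   # 6.5 deg target
print(round(cap_solid_angle(10.0), 3))  # 10 deg target

# Made-up bearings: two toward the target axis, two directly away from it.
axis, length = axial_mean([10.0, 190.0, 10.0, 190.0])
print(axis, length)
```

The two computed solid angles agree with the 0.01 sr and 0.024 sr given in the abstract.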

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we propose generalized sampling approaches for measuring a multi-dimensional object using a compact compound-eye imaging system called thin observation module by bound optics (TOMBO). This paper presents the proposed system model, physical examples, and simulations to verify TOMBO imaging using generalized sampling. In the system, an object is modulated and multiplied by a weight distribution with physical coding, and the coded optical signal is integrated onto a detector array. A numerical estimation algorithm employing a sparsity constraint is used for object reconstruction.
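
The measurement model described in this abstract (a coded weighting of the object followed by integration on a detector) can be written as y = A x, with fewer measurements than unknowns. The sketch below reconstructs a sparse x with ISTA (iterative soft thresholding), one generic sparsity-constrained solver. The random Gaussian matrix `A`, the problem sizes, and the parameter `lam` are assumptions standing in for the physical coding; the abstract does not say which algorithm the authors used.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal map of the L1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam=0.01, n_iter=2000):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by iterative soft thresholding."""
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Synthetic stand-in for the coded measurement: 40 detector values, 100 unknowns.
rng = np.random.default_rng(0)
n, m, k = 100, 40, 4
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.uniform(1.0, 2.0, size=k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true

x_hat = ista(A, y)
print("residual norm:", np.linalg.norm(A @ x_hat - y))
print("largest entries at:", np.sort(np.argsort(-np.abs(x_hat))[:k]))
```

The L1 penalty is what lets an underdetermined system favor a sparse solution; larger `lam` gives sparser but more biased estimates.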

Relevance:

20.00%

Publisher:

Abstract:

The ability to isolate a single sound source among concurrent sources and reverberant energy is necessary for understanding the auditory world. The precedence effect describes a related experimental finding, that when presented with identical sounds from two locations with a short onset asynchrony (on the order of milliseconds), listeners report a single source with a location dominated by the lead sound. Single-cell recordings in multiple animal models have indicated that there are low-level mechanisms that may contribute to the precedence effect, yet psychophysical studies in humans have provided evidence that top-down cognitive processes have a great deal of influence on the perception of simulated echoes. In the present study, event-related potentials evoked by click pairs at and around listeners' echo thresholds indicate that perception of the lead and lag sound as individual sources elicits a negativity between 100 and 250 msec, previously termed the object-related negativity (ORN). Even for physically identical stimuli, the ORN is evident when listeners report hearing, as compared with not hearing, a second sound source. These results define a neural mechanism related to the conscious perception of multiple auditory objects.

Relevance:

20.00%

Publisher:

Abstract:

Our ability to track an object as the same persisting entity over time and motion may primarily rely on spatiotemporal representations which encode some, but not all, of an object's features. Previous researchers using the 'object reviewing' paradigm have demonstrated that such representations can store featural information of well-learned stimuli such as letters and words at a highly abstract level. However, it is unknown whether these representations can also store purely episodic information (i.e. information obtained from a single, novel encounter) that does not correspond to pre-existing type-representations in long-term memory. Here, in an object-reviewing experiment with novel face images as stimuli, observers still produced reliable object-specific preview benefits in dynamic displays: a preview of a novel face on a specific object speeded the recognition of that particular face at a later point when it appeared again on the same object compared to when it reappeared on a different object (beyond display-wide priming), even when all objects moved to new positions in the intervening delay. This case study demonstrates that the mid-level visual representations which keep track of persisting identity over time (e.g. 'object files', in one popular framework) can store not only abstract types from long-term memory, but also specific tokens from online visual experience.

Relevance:

20.00%

Publisher:

Abstract:

The early detection of developmental disorders is key to child outcome, allowing interventions to be initiated which promote development and improve prognosis. Research on autism spectrum disorder (ASD) suggests that behavioral signs can be observed late in the first year of life. Many of these studies involve extensive frame-by-frame video observation and analysis of a child's natural behavior. Although nonintrusive, these methods are extremely time-intensive and require a high level of observer training; thus, they are burdensome for clinical and large population research purposes. This work is a first milestone in a long-term project on non-invasive early observation of children in order to aid in risk detection and research of neurodevelopmental disorders. We focus on providing low-cost computer vision tools to measure and identify ASD behavioral signs based on components of the Autism Observation Scale for Infants (AOSI). In particular, we develop algorithms to measure responses to general ASD risk assessment tasks and activities outlined by the AOSI which assess visual attention by tracking facial features. We show results, including comparisons with expert and nonexpert clinicians, which demonstrate that the proposed computer vision tools can capture critical behavioral observations and potentially augment the clinician's behavioral observations obtained from real in-clinic assessments.

Relevance:

20.00%

Publisher:

Abstract:

The early detection of developmental disorders is key to child outcome, allowing interventions to be initiated that promote development and improve prognosis. Research on autism spectrum disorder (ASD) suggests behavioral markers can be observed late in the first year of life. Many of these studies involved extensive frame-by-frame video observation and analysis of a child's natural behavior. Although non-intrusive, these methods are extremely time-intensive and require a high level of observer training; thus, they are impractical for clinical and large population research purposes. Diagnostic measures for ASD are available for infants but are only accurate when used by specialists experienced in early diagnosis. This work is a first milestone in a long-term multidisciplinary project that aims at helping clinicians and general practitioners accomplish this early detection/measurement task automatically. We focus on providing computer vision tools to measure and identify ASD behavioral markers based on components of the Autism Observation Scale for Infants (AOSI). In particular, we develop algorithms to measure three critical AOSI activities that assess visual attention. We augment these AOSI activities with an additional test that analyzes asymmetrical patterns in unsupported gait. The first set of algorithms involves assessing head motion by tracking facial features, while the gait analysis relies on joint foreground segmentation and 2D body pose estimation in video. We show results that provide insightful knowledge to augment the clinician's behavioral observations obtained from real in-clinic assessments.

Relevance:

20.00%

Publisher:

Abstract:

© 2005-2012 IEEE. Within industrial automation systems, three-dimensional (3-D) vision provides very useful feedback information in autonomous operation of various manufacturing equipment (e.g., industrial robots, material handling devices, assembly systems, and machine tools). The hardware performance in contemporary 3-D scanning devices is suitable for online utilization. However, the bottleneck is the lack of real-time algorithms for recognition of geometric primitives (e.g., planes and natural quadrics) from a scanned point cloud. One of the most important and most frequent geometric primitives in various engineering tasks is the plane. In this paper, we propose a new fast one-pass algorithm for recognition (segmentation and fitting) of planar segments from a point cloud. To effectively segment planar regions, we exploit the orthonormality of certain wavelets to polynomial functions, as well as their sensitivity to abrupt changes. After segmentation of planar regions, we estimate the parameters of corresponding planes using standard fitting procedures. For point cloud structuring, a z-buffer algorithm with mesh triangles representation in barycentric coordinates is employed. The proposed recognition method is tested and experimentally validated in several real-world case studies.
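
The abstract refers to estimating plane parameters with "standard fitting procedures" once a planar region has been segmented. One such procedure is a total least-squares fit via the singular value decomposition, sketched below. The function name `fit_plane` and the synthetic points are assumptions for illustration; the abstract does not specify which fitting method the authors used, and the wavelet-based segmentation step is not shown.

```python
import numpy as np

def fit_plane(points):
    """Fit a plane n . p + d = 0 to an (N, 3) point array by total least squares.

    Returns the unit normal n and offset d. The normal is the right singular
    vector associated with the smallest singular value of the centered points.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    d = -normal @ centroid
    return normal, d

# Synthetic planar segment sampled from z = 2x + 3y + 1.
rng = np.random.default_rng(1)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
z = 2.0 * xy[:, 0] + 3.0 * xy[:, 1] + 1.0
cloud = np.column_stack([xy, z])

normal, d = fit_plane(cloud)
distances = cloud @ normal + d  # signed point-to-plane distances
print(normal, d, np.abs(distances).max())
```

The sign of the returned normal is arbitrary, so downstream code should compare normals up to sign.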

Relevance:

20.00%

Publisher:

Abstract:

Periodic visual stimulation and analysis of the resulting steady-state visual evoked potentials were first introduced over 80 years ago as a means to study visual sensation and perception. From the first single-channel recording of responses to modulated light to the present use of sophisticated digital displays composed of complex visual stimuli and high-density recording arrays, steady-state methods have been applied in a broad range of scientific and applied settings. The purpose of this article is to describe the fundamental stimulation paradigms for steady-state visual evoked potentials and to illustrate these principles through research findings across a range of applications in vision science.
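
A basic step in steady-state analysis is reading the response amplitude at the stimulation frequency from the spectrum of the recording. The sketch below does this with a discrete Fourier transform on a simulated single-channel signal. The sampling rate, stimulation frequency, amplitudes, and the function name `amplitude_at` are all assumptions chosen for illustration, not values from the article.

```python
import numpy as np

def amplitude_at(signal, fs, freq):
    """Amplitude of the spectral component nearest `freq` (Hz).

    Assumes a real-valued signal sampled at `fs` Hz. The factor of 2
    accounts for the one-sided spectrum and is not valid for the DC
    or Nyquist bins.
    """
    n = len(signal)
    spectrum = np.fft.rfft(signal) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    idx = np.argmin(np.abs(freqs - freq))
    return 2.0 * np.abs(spectrum[idx])

# Simulated recording: 4 s at 250 Hz with a 12 Hz steady-state response plus noise.
fs, duration, stim_freq = 250.0, 4.0, 12.0
t = np.arange(int(fs * duration)) / fs
rng = np.random.default_rng(2)
recording = 2.0 * np.sin(2.0 * np.pi * stim_freq * t) + rng.standard_normal(t.size)

print("amplitude at 12 Hz:", amplitude_at(recording, fs, stim_freq))
print("amplitude at 17 Hz:", amplitude_at(recording, fs, 17.0))
```

Choosing a recording length that contains a whole number of stimulation cycles, as here, places the stimulation frequency exactly on a spectral bin and avoids leakage into neighboring bins.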