927 resultados para 3D object recognition


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biometrics is afield of study which pursues the association of a person's identity with his/her physiological or behavioral characteristics.^ As one aspect of biometrics, face recognition has attracted special attention because it is a natural and noninvasive means to identify individuals. Most of the previous studies in face recognition are based on two-dimensional (2D) intensity images. Face recognition based on 2D intensity images, however, is sensitive to environment illumination and subject orientation changes, affecting the recognition results. With the development of three-dimensional (3D) scanners, 3D face recognition is being explored as an alternative to the traditional 2D methods for face recognition.^ This dissertation proposes a method in which the expression and the identity of a face are determined in an integrated fashion from 3D scans. In this framework, there is a front end expression recognition module which sorts the incoming 3D face according to the expression detected in the 3D scans. Then, scans with neutral expressions are processed by a corresponding 3D neutral face recognition module. Alternatively, if a scan displays a non-neutral expression, e.g., a smiling expression, it will be routed to an appropriate specialized recognition module for smiling face recognition.^ The expression recognition method proposed in this dissertation is innovative in that it uses information from 3D scans to perform the classification task. A smiling face recognition module was developed, based on the statistical modeling of the variance between faces with neutral expression and faces with a smiling expression.^ The proposed expression and face recognition framework was tested with a database containing 120 3D scans from 30 subjects (Half are neutral faces and half are smiling faces). It is shown that the proposed framework achieves a recognition rate 10% higher than attempting the identification with only the neutral face recognition module.^

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Multi-resolution modelling has become essential as modern 3D applications demand 3D objects with higher LODs (LOD). Multi-modal devices such as PDAs and UMPCs do not have sufficient resources to handle the original 3D objects. The increased usage of collaborative applications has created many challenges for remote manipulation working with 3D objects of different quality. This paper studies how we can improve multi-resolution techniques by performing multiedge decimation and using annotative commands. It also investigates how devices with poorer quality 3D object can participate in collaborative actions.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Probabilistic robotics, most often applied to the problem of simultaneous localisation and mapping (SLAM), requires measures of uncertainly to accompany observations of the environment. This paper describes how uncertainly can be characterised for a vision system that locates coloured landmark in a typical laboratory environment. The paper describes a model of the uncertainly in segmentation, the internal camera model and the mounting of the camera on the robot. It =plains the implementation of the system on a laboratory robot, and provides experimental results that show the coherence of the uncertainly model,

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Eigen-based techniques and other monolithic approaches to face recognition have long been a cornerstone in the face recognition community due to the high dimensionality of face images. Eigen-face techniques provide minimal reconstruction error and limit high-frequency content while linear discriminant-based techniques (fisher-faces) allow the construction of subspaces which preserve discriminatory information. This paper presents a frequency decomposition approach for improved face recognition performance utilising three well-known techniques: Wavelets; Gabor / Log-Gabor; and the Discrete Cosine Transform. Experimentation illustrates that frequency domain partitioning prior to dimensionality reduction increases the information available for classification and greatly increases face recognition performance for both eigen-face and fisher-face approaches.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A new algorithm for extracting features from images for object recognition is described. The algorithm uses higher order spectra to provide desirable invariance properties, to provide noise immunity, and to incorporate nonlinearity into the feature extraction procedure thereby allowing the use of simple classifiers. An image can be reduced to a set of 1D functions via the Radon transform, or alternatively, the Fourier transform of each 1D projection can be obtained from a radial slice of the 2D Fourier transform of the image according to the Fourier slice theorem. A triple product of Fourier coefficients, referred to as the deterministic bispectrum, is computed for each 1D function and is integrated along radial lines in bifrequency space. Phases of the integrated bispectra are shown to be translation- and scale-invariant. Rotation invariance is achieved by a regrouping of these invariants at a constant radius followed by a second stage of invariant extraction. Rotation invariance is thus converted to translation invariance in the second step. Results using synthetic and actual images show that isolated, compact clusters are formed in feature space. These clusters are linearly separable, indicating that the nonlinearity required in the mapping from the input space to the classification space is incorporated well into the feature extraction stage. The use of higher order spectra results in good noise immunity, as verified with synthetic and real images. Classification of images using the higher order spectra-based algorithm compares favorably to classification using the method of moment invariants

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A new approach to pattern recognition using invariant parameters based on higher order spectra is presented. In particular, invariant parameters derived from the bispectrum are used to classify one-dimensional shapes. The bispectrum, which is translation invariant, is integrated along straight lines passing through the origin in bifrequency space. The phase of the integrated bispectrum is shown to be scale and amplification invariant, as well. A minimal set of these invariants is selected as the feature vector for pattern classification, and a minimum distance classifier using a statistical distance measure is used to classify test patterns. The classification technique is shown to distinguish two similar, but different bolts given their one-dimensional profiles. Pattern recognition using higher order spectral invariants is fast, suited for parallel implementation, and has high immunity to additive Gaussian noise. Simulation results show very high classification accuracy, even for low signal-to-noise ratios.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The most common software analysis tools available for measuring fluorescence images are for two-dimensional (2D) data that rely on manual settings for inclusion and exclusion of data points, and computer-aided pattern recognition to support the interpretation and findings of the analysis. It has become increasingly important to be able to measure fluorescence images constructed from three-dimensional (3D) datasets in order to be able to capture the complexity of cellular dynamics and understand the basis of cellular plasticity within biological systems. Sophisticated microscopy instruments have permitted the visualization of 3D fluorescence images through the acquisition of multispectral fluorescence images and powerful analytical software that reconstructs the images from confocal stacks that then provide a 3D representation of the collected 2D images. Advanced design-based stereology methods have progressed from the approximation and assumptions of the original model-based stereology(1) even in complex tissue sections(2). Despite these scientific advances in microscopy, a need remains for an automated analytic method that fully exploits the intrinsic 3D data to allow for the analysis and quantification of the complex changes in cell morphology, protein localization and receptor trafficking. Current techniques available to quantify fluorescence images include Meta-Morph (Molecular Devices, Sunnyvale, CA) and Image J (NIH) which provide manual analysis. Imaris (Andor Technology, Belfast, Northern Ireland) software provides the feature MeasurementPro, which allows the manual creation of measurement points that can be placed in a volume image or drawn on a series of 2D slices to create a 3D object. This method is useful for single-click point measurements to measure a line distance between two objects or to create a polygon that encloses a region of interest, but it is difficult to apply to complex cellular network structures. Filament Tracer (Andor) allows automatic detection of the 3D neuronal filament-like however, this module has been developed to measure defined structures such as neurons, which are comprised of dendrites, axons and spines (tree-like structure). This module has been ingeniously utilized to make morphological measurements to non-neuronal cells(3), however, the output data provide information of an extended cellular network by using a software that depends on a defined cell shape rather than being an amorphous-shaped cellular model. To overcome the issue of analyzing amorphous-shaped cells and making the software more suitable to a biological application, Imaris developed Imaris Cell. This was a scientific project with the Eidgenössische Technische Hochschule, which has been developed to calculate the relationship between cells and organelles. While the software enables the detection of biological constraints, by forcing one nucleus per cell and using cell membranes to segment cells, it cannot be utilized to analyze fluorescence data that are not continuous because ideally it builds cell surface without void spaces. To our knowledge, at present no user-modifiable automated approach that provides morphometric information from 3D fluorescence images has been developed that achieves cellular spatial information of an undefined shape (Figure 1). We have developed an analytical platform using the Imaris core software module and Imaris XT interfaced to MATLAB (Mat Works, Inc.). These tools allow the 3D measurement of cells without a pre-defined shape and with inconsistent fluorescence network components. Furthermore, this method will allow researchers who have extended expertise in biological systems, but not familiarity to computer applications, to perform quantification of morphological changes in cell dynamics.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper is concerned with the unsupervised learning of object representations by fusing visual and motor information. The problem is posed for a mobile robot that develops its representations as it incrementally gathers data. The scenario is problematic as the robot only has limited information at each time step with which it must generate and update its representations. Object representations are refined as multiple instances of sensory data are presented; however, it is uncertain whether two data instances are synonymous with the same object. This process can easily diverge from stability. The premise of the presented work is that a robot's motor information instigates successful generation of visual representations. An understanding of self-motion enables a prediction to be made before performing an action, resulting in a stronger belief of data association. The system is implemented as a data-driven partially observable semi-Markov decision process. Object representations are formed as the process's hidden states and are coordinated with motor commands through state transitions. Experiments show the prediction process is essential in enabling the unsupervised learning method to converge to a solution - improving precision and recall over using sensory data alone.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In recent years face recognition systems have been applied in various useful applications, such as surveillance, access control, criminal investigations, law enforcement, and others. However face biometric systems can be highly vulnerable to spoofing attacks where an impostor tries to bypass the face recognition system using a photo or video sequence. In this paper a novel liveness detection method, based on the 3D structure of the face, is proposed. Processing the 3D curvature of the acquired data, the proposed approach allows a biometric system to distinguish a real face from a photo, increasing the overall performance of the system and reducing its vulnerability. In order to test the real capability of the methodology a 3D face database has been collected simulating spoofing attacks, therefore using photographs instead of real faces. The experimental results show the effectiveness of the proposed approach.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper investigates how neuronal activation for naming photographs of objects is influenced by the addition of appropriate colour or sound. Behaviourally, both colour and sound are known to facilitate object recognition from visual form. However, previous functional imaging studies have shown inconsistent effects. For example, the addition of appropriate colour has been shown to reduce antero-medial temporal activation whereas the addition of sound has been shown to increase posterior superior temporal activation. Here we compared the effect of adding colour or sound cues in the same experiment. We found that the addition of either the appropriate colour or sound increased activation for naming photographs of objects in bilateral occipital regions and the right anterior fusiform. Moreover, the addition of colour reduced left antero-medial temporal activation but this effect was not observed for the addition of object sound. We propose that activation in bilateral occipital and right fusiform areas precedes the integration of visual form with either its colour or associated sound. In contrast, left antero-medial temporal activation is reduced because object recognition is facilitated after colour and form have been integrated.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper presents visual detection and classification of light vehicles and personnel on a mine site.We capitalise on the rapid advances of ConvNet based object recognition but highlight that a naive black box approach results in a significant number of false positives. In particular, the lack of domain specific training data and the unique landscape in a mine site causes a high rate of errors. We exploit the abundance of background-only images to train a k-means classifier to complement the ConvNet. Furthermore, localisation of objects of interest and a reduction in computation is enabled through region proposals. Our system is tested on over 10km of real mine site data and we were able to detect both light vehicles and personnel. We show that the introduction of our background model can reduce the false positive rate by an order of magnitude.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The environment moderates behaviour using a subtle language of ‘affordances’ and ‘behaviour-settings’. Affordances are environmental offerings. They are objects that demand action; a cliff demands a leap and binoculars demand a peek. Behaviour-settings are ‘places;’ spaces encoded with expectations and meanings. Behaviour-settings work the opposite way to affordances; they demand inhibition; an introspective demeanour in a church or when under surveillance. Most affordances and behaviour-settings are designed, and as such, designers are effectively predicting brain reactions. • Affordances are nested within, and moderated by behaviour-settings. Both trigger automatic neural responses (excitation and inhibition). These, for the best part cancel each other out. This balancing enables object recognition and allows choice about what action should be taken (if any). But when excitation exceeds inhibition, instinctive action will automatically commence. In positive circumstances this may mean laughter or a smile. In negative circumstances, fleeing, screaming or other panic responses are likely. People with poor frontal function, due to immaturity (childhood or developmental disorders) or due to hypofrontality (schizophrenia, brain damage or dementia) have a reduced capacity to balance excitatory and inhibitory impulses. For these people, environmental behavioural demands increase with the decline of frontal brain function. • The world around us is not only encoded with symbols and sensory information. Opportunities and restrictions work on a much more primal level. Person/space interactions constantly take place at a molecular scale. Every space we enter has its own special dynamic, where individualism vies for supremacy between the opposing forces of affordance-related excitation and the inhibition intrinsic to behaviour-settings. And in this context, even a small change–the installation of a CCTV camera can turn a circus to a prison. • This paper draws on cutting-edge neurological theory to understand the psychological determinates of the everyday experience of the designed environment.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The neural basis of visual perception can be understood only when the sequence of cortical activity underlying successful recognition is known. The early steps in this processing chain, from retina to the primary visual cortex, are highly local, and the perception of more complex shapes requires integration of the local information. In Study I of this thesis, the progression from local to global visual analysis was assessed by recording cortical magnetoencephalographic (MEG) responses to arrays of elements that either did or did not form global contours. The results demonstrated two spatially and temporally distinct stages of processing: The first, emerging 70 ms after stimulus onset around the calcarine sulcus, was sensitive to local features only, whereas the second, starting at 130 ms across the occipital and posterior parietal cortices, reflected the global configuration. To explore the links between cortical activity and visual recognition, Studies II III presented subjects with recognition tasks of varying levels of difficulty. The occipito-temporal responses from 150 ms onwards were closely linked to recognition performance, in contrast to the 100-ms mid-occipital responses. The averaged responses increased gradually as a function of recognition performance, and further analysis (Study III) showed the single response strengths to be graded as well. Study IV addressed the attention dependence of the different processing stages: Occipito-temporal responses peaking around 150 ms depended on the content of the visual field (faces vs. houses), whereas the later and more sustained activity was strongly modulated by the observers attention. Hemodynamic responses paralleled the pattern of the more sustained electrophysiological responses. Study V assessed the temporal processing capacity of the human object recognition system. Above sufficient luminance, contrast and size of the object, the processing speed was not limited by such low-level factors. Taken together, these studies demonstrate several distinct stages in the cortical activation sequence underlying the object recognition chain, reflecting the level of feature integration, difficulty of recognition, and direction of attention.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A new liquid crystal device structure has been developed using a vertically grown Multi-Wall Carbon NanoTube (MWCNT) as a 3D electrode structure, which allows complicated phase only hologram to be displayed using conventional liquid crystal materials. The nanotubes act as an individual electrode sites that generate an electric field profile, dictating the refractive index profile with the liquid crystal cell. Changing the electric field applied makes it possible to tune the properties to modulate the light in an ideal kinoform. A perfect 3D image can be generated by a computer generated hologram by using the diffraction of the light from the hologram pixels to create an optical wave front that appears to come from 3D object. A multilevel phase modulating device based on nematic LC's is also under progress, which will be used with the LC/CNT devices on an LCOS backplane to project a full 3D image from the kinoform.