874 resultados para 3D object recognition system


Relevância:

30.00% 30.00%

Publicador:

Resumo:

An application of image processing techniques to recognition of hand-drawn circuit diagrams is presented. The scanned image of a diagram is pre-processed to remove noise and converted to bilevel. Morphological operations are applied to obtain a clean, connected representation using thinned lines. The diagram comprises of nodes, connections and components. Nodes and components are segmented using appropriate thresholds on a spatially varying object pixel density. Connection paths are traced using a pixel-stack. Nodes are classified using syntactic analysis. Components are classified using a combination of invariant moments, scalar pixel-distribution features, and vector relationships between straight lines in polygonal representations. A node recognition accuracy of 82% and a component recognition accuracy of 86% was achieved on a database comprising 107 nodes and 449 components. This recogniser can be used for layout “beautification” or to generate input code for circuit analysis and simulation packages

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A system to segment and recognize Australian 4-digit postcodes from address labels on parcels is described. Images of address labels are preprocessed and adaptively thresholded to reduce noise. Projections are used to segment the line and then the characters comprising the postcode. Individual digits are recognized using bispectral features extracted from their parallel beam projections. These features are insensitive to translation, scaling and rotation, and robust to noise. Results on scanned images are presented. The system is currently being improved and implemented to work on-line.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new technique is proposed for learning the dynamic characteristics of a deformable object, applied in particular to the problem of lip-tracking. Experimental results are given which demonstrate that the use of dynamic models allows the system to track more robustly under adverse conditions and to correct spurious, poorly tracked frames

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Gait energy images (GEIs) and its variants form the basis of many recent appearance-based gait recognition systems. The GEI combines good recognition performance with a simple implementation, though it suffers problems inherent to appearance-based approaches, such as being highly view dependent. In this paper, we extend the concept of the GEI to 3D, to create what we call the gait energy volume, or GEV. A basic GEV implementation is tested on the CMU MoBo database, showing improvements over both the GEI baseline and a fused multi-view GEI approach. We also demonstrate the efficacy of this approach on partial volume reconstructions created from frontal depth images, which can be more practically acquired, for example, in biometric portals implemented with stereo cameras, or other depth acquisition systems. Experiments on frontal depth images are evaluated on an in-house developed database captured using the Microsoft Kinect, and demonstrate the validity of the proposed approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a novel technique for performing SLAM along a continuous trajectory of appearance. Derived from components of FastSLAM and FAB-MAP, the new system dubbed Continuous Appearance-based Trajectory SLAM (CAT-SLAM) augments appearancebased place recognition with particle-filter based ‘pose filtering’ within a probabilistic framework, without calculating global feature geometry or performing 3D map construction. For loop closure detection CAT-SLAM updates in constant time regardless of map size. We evaluate the effectiveness of CAT-SLAM on a 16km outdoor road network and determine its loop closure performance relative to FAB-MAP. CAT-SLAM recognizes 3 times the number of loop closures for the case where no false positives occur, demonstrating its potential use for robust loop closure detection in large environments.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An interactive installation with full body interface, digital projection, multi-touch sensitive screen surfaces, interactive 3D gaming software, motorised dioramas, 4.1 spatial sound & new furniture forms - investigating the cultural dimensions of sustainability through the lens of 'time'. “Time is change, time is finitude. Humans are a finite species. Every decision we make today brings that end closer, or alternatively pushes it further away. Nothing can be neutral”. Tony Fry DETAILS: Finitude (Mallee:Time) is a major new media/sculptural hybrid work premiered in 2011 in version 1 at the Ka-rama Motel for the Mildura Palimpsest #8 ('Collaborators and Saboteurs'). Each participant/viewer lies comfortably on their back on the double bed of Room 22. Directly above them, supported by a wooden structure, not unlike a house frame, is a semi-transparent Perspex screen that displays projected 3D imagery and is simultaneously sensitive to the lightest of finger touches. Depending upon the ever changing qualities of the projected image on this screen the participant can see through its surface to a series of physical dioramas suspended above, lit by subtle LED spotlighting. This diorama consists of a slowly rotating series of physical environments, which also include several animatronic components, allowing the realtime composition of whimsical ‘landscapes’ of both 'real' and 'virtual' media. Through subtle, non-didactic touch-sensitive interactivity the participant then has influence over both the 3D graphic imagery, the physical movements of the diorama and the 4.1 immersive soundscape, creating an uncanny blend of physical and virtual media. Five speakers positioned around the room deliver a rich interactive soundscape that responds both audibly and physically to interactions. VERSION 1, CONTEXT/THEORY: Finitude (Mallee: Time) is Version 1 of a series of presentations during 2012-14. This version has been inspired through a series of recent visits and residencies in the SW Victoria Mallee country. Further drawing on recent writings by post colonial author Paul Carter, the work is envisaged as an evolving ‘personal topography’ of place-discovery. By contrasting and melding readily available generalisations of the Mallee regions’ rational surfaces, climatic maps and ecological systems with what Carter calls “a fine capillary system of interconnected words, places, memories and sensations” generated through my own idiosyncratic research processes, Finitude (Mallee Time) invokes a “dark writing” of place through outside eyes - an approach that avoids concentration upon what 'everyone else knows', to instead imagine and develop a sense how things might be. This basis in re-imagining and re-invention becomes the vehicle for the work’s more fundamental intention - as a meditative re-imagination of 'time' (and region) as finite resources: Towards this end, every object, process and idea in the work is re-thought as having its own ‘time component’ or ‘residue’ that becomes deposited into our 'collective future'. Thought this way Finitude (Mallee Time) suggests the poverty of predominant images of time as ‘mechanism’ to instead envisage time as a plastic cyclical medium that we can each choose to ‘give to’ or ‘take away from’ our future. Put another way - time has become finitude.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose an approach to employ eigen light-fields for face recognition across pose on video. Faces of a subject are collected from video frames and combined based on the pose to obtain a set of probe light-fields. These probe data are then projected to the principal subspace of the eigen light-fields within which the classification takes place. We modify the original light-field projection and found that it is more robust in the proposed system. Evaluation on VidTIMIT dataset has demonstrated that the eigen light-fields method is able to take advantage of multiple observations contained in the video.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Accurate and detailed road models play an important role in a number of geospatial applications, such as infrastructure planning, traffic monitoring, and driver assistance systems. In this thesis, an integrated approach for the automatic extraction of precise road features from high resolution aerial images and LiDAR point clouds is presented. A framework of road information modeling has been proposed, for rural and urban scenarios respectively, and an integrated system has been developed to deal with road feature extraction using image and LiDAR analysis. For road extraction in rural regions, a hierarchical image analysis is first performed to maximize the exploitation of road characteristics in different resolutions. The rough locations and directions of roads are provided by the road centerlines detected in low resolution images, both of which can be further employed to facilitate the road information generation in high resolution images. The histogram thresholding method is then chosen to classify road details in high resolution images, where color space transformation is used for data preparation. After the road surface detection, anisotropic Gaussian and Gabor filters are employed to enhance road pavement markings while constraining other ground objects, such as vegetation and houses. Afterwards, pavement markings are obtained from the filtered image using the Otsu's clustering method. The final road model is generated by superimposing the lane markings on the road surfaces, where the digital terrain model (DTM) produced by LiDAR data can also be combined to obtain the 3D road model. As the extraction of roads in urban areas is greatly affected by buildings, shadows, vehicles, and parking lots, we combine high resolution aerial images and dense LiDAR data to fully exploit the precise spectral and horizontal spatial resolution of aerial images and the accurate vertical information provided by airborne LiDAR. Objectoriented image analysis methods are employed to process the feature classiffcation and road detection in aerial images. In this process, we first utilize an adaptive mean shift (MS) segmentation algorithm to segment the original images into meaningful object-oriented clusters. Then the support vector machine (SVM) algorithm is further applied on the MS segmented image to extract road objects. Road surface detected in LiDAR intensity images is taken as a mask to remove the effects of shadows and trees. In addition, normalized DSM (nDSM) obtained from LiDAR is employed to filter out other above-ground objects, such as buildings and vehicles. The proposed road extraction approaches are tested using rural and urban datasets respectively. The rural road extraction method is performed using pan-sharpened aerial images of the Bruce Highway, Gympie, Queensland. The road extraction algorithm for urban regions is tested using the datasets of Bundaberg, which combine aerial imagery and LiDAR data. Quantitative evaluation of the extracted road information for both datasets has been carried out. The experiments and the evaluation results using Gympie datasets show that more than 96% of the road surfaces and over 90% of the lane markings are accurately reconstructed, and the false alarm rates for road surfaces and lane markings are below 3% and 2% respectively. For the urban test sites of Bundaberg, more than 93% of the road surface is correctly reconstructed, and the mis-detection rate is below 10%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We address the problem of face recognition on video by employing the recently proposed probabilistic linear discrimi-nant analysis (PLDA). The PLDA has been shown to be robust against pose and expression in image-based face recognition. In this research, the method is extended and applied to video where image set to image set matching is performed. We investigate two approaches of computing similarities between image sets using the PLDA: the closest pair approach and the holistic sets approach. To better model face appearances in video, we also propose the heteroscedastic version of the PLDA which learns the within-class covariance of each individual separately. Our experi-ments on the VidTIMIT and Honda datasets show that the combination of the heteroscedastic PLDA and the closest pair approach achieves the best performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Facial expression is one of the main issues of face recognition in uncontrolled environments. In this paper, we apply the probabilistic linear discriminant analysis (PLDA) method to recognize faces across expressions. Several PLDA approaches are tested and cross-evaluated on the Cohn-Kanade and JAFFE databases. With less samples per gallery subject, high recognition rates comparable to previous works have been achieved indicating the robustness of the approaches. Among the approaches, the mixture of PLDAs has demonstrated better performances. The experimental results also indicate that facial regions around the cheeks, eyes, and eyebrows are more discriminative than regions around the mouth, jaw, chin, and nose.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper describes a new system, dubbed Continuous Appearance-based Trajectory Simultaneous Localisation and Mapping (CAT-SLAM), which augments sequential appearance-based place recognition with local metric pose filtering to improve the frequency and reliability of appearance-based loop closure. As in other approaches to appearance-based mapping, loop closure is performed without calculating global feature geometry or performing 3D map construction. Loop-closure filtering uses a probabilistic distribution of possible loop closures along the robot’s previous trajectory, which is represented by a linked list of previously visited locations linked by odometric information. Sequential appearance-based place recognition and local metric pose filtering are evaluated simultaneously using a Rao–Blackwellised particle filter, which weights particles based on appearance matching over sequential frames and the similarity of robot motion along the trajectory. The particle filter explicitly models both the likelihood of revisiting previous locations and exploring new locations. A modified resampling scheme counters particle deprivation and allows loop-closure updates to be performed in constant time for a given environment. We compare the performance of CAT-SLAM with FAB-MAP (a state-of-the-art appearance-only SLAM algorithm) using multiple real-world datasets, demonstrating an increase in the number of correct loop closures detected by CAT-SLAM.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new system is described for estimating volume from a series of multiplanar 2D ultrasound images. Ultrasound images are captured using a personal computer video digitizing card and an electromagnetic localization system is used to record the pose of the ultrasound images. The accuracy of the system was assessed by scanning four groups of ten cadaveric kidneys on four different ultrasound machines. Scan image planes were oriented either radially, in parallel or slanted at 30 C to the vertical. The cross-sectional images of the kidneys were traced using a mouse and the outline points transformed to 3D space using the Fastrak position and orientation data. Points on adjacent region of interest outlines were connected to form a triangle mesh and the volume of the kidneys estimated using the ellipsoid, planimetry, tetrahedral and ray tracing methods. There was little difference between the results for the different scan techniques or volume estimation algorithms, although, perhaps as expected, the ellipsoid results were the least precise. For radial scanning and ray tracing, the mean and standard deviation of the percentage errors for the four different machines were as follows: Hitachi EUB-240, −3.0 ± 2.7%; Tosbee RM3, −0.1 ± 2.3%; Hitachi EUB-415, 0.2 ± 2.3%; Acuson, 2.7 ± 2.3%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis develops a detailed conceptual design method and a system software architecture defined with a parametric and generative evolutionary design system to support an integrated interdisciplinary building design approach. The research recognises the need to shift design efforts toward the earliest phases of the design process to support crucial design decisions that have a substantial cost implication on the overall project budget. The overall motivation of the research is to improve the quality of designs produced at the author's employer, the General Directorate of Major Works (GDMW) of the Saudi Arabian Armed Forces. GDMW produces many buildings that have standard requirements, across a wide range of environmental and social circumstances. A rapid means of customising designs for local circumstances would have significant benefits. The research considers the use of evolutionary genetic algorithms in the design process and the ability to generate and assess a wider range of potential design solutions than a human could manage. This wider ranging assessment, during the early stages of the design process, means that the generated solutions will be more appropriate for the defined design problem. The research work proposes a design method and system that promotes a collaborative relationship between human creativity and the computer capability. The tectonic design approach is adopted as a process oriented design that values the process of design as much as the product. The aim is to connect the evolutionary systems to performance assessment applications, which are used as prioritised fitness functions. This will produce design solutions that respond to their environmental and function requirements. This integrated, interdisciplinary approach to design will produce solutions through a design process that considers and balances the requirements of all aspects of the design. Since this thesis covers a wide area of research material, 'methodological pluralism' approach was used, incorporating both prescriptive and descriptive research methods. Multiple models of research were combined and the overall research was undertaken following three main stages, conceptualisation, developmental and evaluation. The first two stages lay the foundations for the specification of the proposed system where key aspects of the system that have not previously been proven in the literature, were implemented to test the feasibility of the system. As a result of combining the existing knowledge in the area with the newlyverified key aspects of the proposed system, this research can form the base for a future software development project. The evaluation stage, which includes building the prototype system to test and evaluate the system performance based on the criteria defined in the earlier stage, is not within the scope this thesis. The research results in a conceptual design method and a proposed system software architecture. The proposed system is called the 'Hierarchical Evolutionary Algorithmic Design (HEAD) System'. The HEAD system has shown to be feasible through the initial illustrative paper-based simulation. The HEAD system consists of the two main components - 'Design Schema' and the 'Synthesis Algorithms'. The HEAD system reflects the major research contribution in the way it is conceptualised, while secondary contributions are achieved within the system components. The design schema provides constraints on the generation of designs, thus enabling the designer to create a wide range of potential designs that can then be analysed for desirable characteristics. The design schema supports the digital representation of the human creativity of designers into a dynamic design framework that can be encoded and then executed through the use of evolutionary genetic algorithms. The design schema incorporates 2D and 3D geometry and graph theory for space layout planning and building formation using the Lowest Common Design Denominator (LCDD) of a parameterised 2D module and a 3D structural module. This provides a bridge between the standard adjacency requirements and the evolutionary system. The use of graphs as an input to the evolutionary algorithm supports the introduction of constraints in a way that is not supported by standard evolutionary techniques. The process of design synthesis is guided as a higher level description of the building that supports geometrical constraints. The Synthesis Algorithms component analyses designs at four levels, 'Room', 'Layout', 'Building' and 'Optimisation'. At each level multiple fitness functions are embedded into the genetic algorithm to target the specific requirements of the relevant decomposed part of the design problem. Decomposing the design problem to allow for the design requirements of each level to be dealt with separately and then reassembling them in a bottom up approach reduces the generation of non-viable solutions through constraining the options available at the next higher level. The iterative approach, in exploring the range of design solutions through modification of the design schema as the understanding of the design problem improves, assists in identifying conflicts in the design requirements. Additionally, the hierarchical set-up allows the embedding of multiple fitness functions into the genetic algorithm, each relevant to a specific level. This supports an integrated multi-level, multi-disciplinary approach. The HEAD system promotes a collaborative relationship between human creativity and the computer capability. The design schema component, as the input to the procedural algorithms, enables the encoding of certain aspects of the designer's subjective creativity. By focusing on finding solutions for the relevant sub-problems at the appropriate levels of detail, the hierarchical nature of the system assist in the design decision-making process.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification system in real world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance, however, the robustness of HTPLDA to the limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze the speaker verification performance with regards to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.