347 resultados para object recognition


Relevância:

20.00% 20.00%

Publicador:

Resumo:

“The Relevance of Religion” is the title of a recent address delivered by The Honourable Chief Justice Murray Gleeson of the High Court of Australia.1 In making the point “about the continuing public importance of religion”, the Chief Justice referenced Lord Devlin’s contention that “no society has yet solved the problem of how to teach morality without religion”....

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically and a framework is developed such that results from super-resolution techniques can be compared. Three super-resolution techniques are compared with the Eigenface and Elastic Bunch Graph Matching face recognition engines. Parameter ranges over which these techniques provide better recognition performance than interpolated images is determined.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification system in real world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance, however, the robustness of HTPLDA to the limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze the speaker verification performance with regards to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question, how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contract to many existing techniques, the technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds the state of the art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7 bit Lego light sensors, as well as using 16 and 32 pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Monitoring the natural environment is increasingly important as habit degradation and climate change reduce theworld’s biodiversity.We have developed software tools and applications to assist ecologists with the collection and analysis of acoustic data at large spatial and temporal scales.One of our key objectives is automated animal call recognition, and our approach has three novel attributes. First, we work with raw environmental audio, contaminated by noise and artefacts and containing calls that vary greatly in volume depending on the animal’s proximity to the microphone. Second, initial experimentation suggested that no single recognizer could dealwith the enormous variety of calls. Therefore, we developed a toolbox of generic recognizers to extract invariant features for each call type. Third, many species are cryptic and offer little data with which to train a recognizer. Many popular machine learning methods require large volumes of training and validation data and considerable time and expertise to prepare. Consequently we adopt bootstrap techniques that can be initiated with little data and refined subsequently. In this paper, we describe our recognition tools and present results for real ecological problems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting and illumination conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems makes them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded in 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Audio-visualspeechrecognition, or the combination of visual lip-reading with traditional acoustic speechrecognition, has been previously shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper will extend upon the established audio-visualspeechrecognition literature to show that further improvements in speechrecognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visualspeechrecognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) are conducted on the four-camera AVICAR automotiveaudio-visualspeech database. We study the relative contribution between the side and central orientated cameras in improving visualspeechrecognition accuracy. Finally combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Facial expression is an important channel of human social communication. Facial expression recognition (FER) aims to perceive and understand emotional states of humans based on information in the face. Building robust and high performance FER systems that can work in real-world video is still a challenging task, due to the various unpredictable facial variations and complicated exterior environmental conditions, as well as the difficulty of choosing a suitable type of feature descriptor for extracting discriminative facial information. Facial variations caused by factors such as pose, age, gender, race and occlusion, can exert profound influence on the robustness, while a suitable feature descriptor largely determines the performance. Most present attention on FER has been paid to addressing variations in pose and illumination. No approach has been reported on handling face localization errors and relatively few on overcoming facial occlusions, although the significant impact of these two variations on the performance has been proved and highlighted in many previous studies. Many texture and geometric features have been previously proposed for FER. However, few comparison studies have been conducted to explore the performance differences between different features and examine the performance improvement arisen from fusion of texture and geometry, especially on data with spontaneous emotions. The majority of existing approaches are evaluated on databases with posed or induced facial expressions collected in laboratory environments, whereas little attention has been paid on recognizing naturalistic facial expressions on real-world data. This thesis investigates techniques for building robust and high performance FER systems based on a number of established feature sets. It comprises of contributions towards three main objectives: (1) Robustness to face localization errors and facial occlusions. An approach is proposed to handle face localization errors and facial occlusions using Gabor based templates. Template extraction algorithms are designed to collect a pool of local template features and template matching is then performed to covert these templates into distances, which are robust to localization errors and occlusions. (2) Improvement of performance through feature comparison, selection and fusion. A comparative framework is presented to compare the performance between different features and different feature selection algorithms, and examine the performance improvement arising from fusion of texture and geometry. The framework is evaluated for both discrete and dimensional expression recognition on spontaneous data. (3) Evaluation of performance in the context of real-world applications. A system is selected and applied into discriminating posed versus spontaneous expressions and recognizing naturalistic facial expressions. A database is collected from real-world recordings and is used to explore feature differences between standard database images and real-world images, as well as between real-world images and real-world video frames. The performance evaluations are based on the JAFFE, CK, Feedtum, NVIE, Semaine and self-collected QUT databases. The results demonstrate high robustness of the proposed approach to the simulated localization errors and occlusions. Texture and geometry have different contributions to the performance of discrete and dimensional expression recognition, as well as posed versus spontaneous emotion discrimination. These investigations provide useful insights into enhancing robustness and achieving high performance of FER systems, and putting them into real-world applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The serviceability and safety of bridges are crucial to people’s daily lives and to the national economy. Every effort should be taken to make sure that bridges function safely and properly as any damage or fault during the service life can lead to transport paralysis, catastrophic loss of property or even casualties. Nonetheless, aggressive environmental conditions, ever-increasing and changing traffic loads and aging can all contribute to bridge deterioration. With often constrained budget, it is of significance to identify bridges and bridge elements that should be given higher priority for maintenance, rehabilitation or replacement, and to select optimal strategy. Bridge health prediction is an essential underpinning science to bridge maintenance optimization, since the effectiveness of optimal maintenance decision is largely dependent on the forecasting accuracy of bridge health performance. The current approaches for bridge health prediction can be categorised into two groups: condition ratings based and structural reliability based. A comprehensive literature review has revealed the following limitations of the current modelling approaches: (1) it is not evident in literature to date that any integrated approaches exist for modelling both serviceability and safety aspects so that both performance criteria can be evaluated coherently; (2) complex system modelling approaches have not been successfully applied to bridge deterioration modelling though a bridge is a complex system composed of many inter-related bridge elements; (3) multiple bridge deterioration factors, such as deterioration dependencies among different bridge elements, observed information, maintenance actions and environmental effects have not been considered jointly; (4) the existing approaches are lacking in Bayesian updating ability to incorporate a variety of event information; (5) the assumption of series and/or parallel relationship for bridge level reliability is always held in all structural reliability estimation of bridge systems. To address the deficiencies listed above, this research proposes three novel models based on the Dynamic Object Oriented Bayesian Networks (DOOBNs) approach. Model I aims to address bridge deterioration in serviceability using condition ratings as the health index. The bridge deterioration is represented in a hierarchical relationship, in accordance with the physical structure, so that the contribution of each bridge element to bridge deterioration can be tracked. A discrete-time Markov process is employed to model deterioration of bridge elements over time. In Model II, bridge deterioration in terms of safety is addressed. The structural reliability of bridge systems is estimated from bridge elements to the entire bridge. By means of conditional probability tables (CPTs), not only series-parallel relationship but also complex probabilistic relationship in bridge systems can be effectively modelled. The structural reliability of each bridge element is evaluated from its limit state functions, considering the probability distributions of resistance and applied load. Both Models I and II are designed in three steps: modelling consideration, DOOBN development and parameters estimation. Model III integrates Models I and II to address bridge health performance in both serviceability and safety aspects jointly. The modelling of bridge ratings is modified so that every basic modelling unit denotes one physical bridge element. According to the specific materials used, the integration of condition ratings and structural reliability is implemented through critical failure modes. Three case studies have been conducted to validate the proposed models, respectively. Carefully selected data and knowledge from bridge experts, the National Bridge Inventory (NBI) and existing literature were utilised for model validation. In addition, event information was generated using simulation to demonstrate the Bayesian updating ability of the proposed models. The prediction results of condition ratings and structural reliability were presented and interpreted for basic bridge elements and the whole bridge system. The results obtained from Model II were compared with the ones obtained from traditional structural reliability methods. Overall, the prediction results demonstrate the feasibility of the proposed modelling approach for bridge health prediction and underpin the assertion that the three models can be used separately or integrated and are more effective than the current bridge deterioration modelling approaches. The primary contribution of this work is to enhance the knowledge in the field of bridge health prediction, where more comprehensive health performance in both serviceability and safety aspects are addressed jointly. The proposed models, characterised by probabilistic representation of bridge deterioration in hierarchical ways, demonstrated the effectiveness and pledge of DOOBNs approach to bridge health management. Additionally, the proposed models have significant potential for bridge maintenance optimization. Working together with advanced monitoring and inspection techniques, and a comprehensive bridge inventory, the proposed models can be used by bridge practitioners to achieve increased serviceability and safety as well as maintenance cost effectiveness.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The most common software analysis tools available for measuring fluorescence images are for two-dimensional (2D) data that rely on manual settings for inclusion and exclusion of data points, and computer-aided pattern recognition to support the interpretation and findings of the analysis. It has become increasingly important to be able to measure fluorescence images constructed from three-dimensional (3D) datasets in order to be able to capture the complexity of cellular dynamics and understand the basis of cellular plasticity within biological systems. Sophisticated microscopy instruments have permitted the visualization of 3D fluorescence images through the acquisition of multispectral fluorescence images and powerful analytical software that reconstructs the images from confocal stacks that then provide a 3D representation of the collected 2D images. Advanced design-based stereology methods have progressed from the approximation and assumptions of the original model-based stereology(1) even in complex tissue sections(2). Despite these scientific advances in microscopy, a need remains for an automated analytic method that fully exploits the intrinsic 3D data to allow for the analysis and quantification of the complex changes in cell morphology, protein localization and receptor trafficking. Current techniques available to quantify fluorescence images include Meta-Morph (Molecular Devices, Sunnyvale, CA) and Image J (NIH) which provide manual analysis. Imaris (Andor Technology, Belfast, Northern Ireland) software provides the feature MeasurementPro, which allows the manual creation of measurement points that can be placed in a volume image or drawn on a series of 2D slices to create a 3D object. This method is useful for single-click point measurements to measure a line distance between two objects or to create a polygon that encloses a region of interest, but it is difficult to apply to complex cellular network structures. Filament Tracer (Andor) allows automatic detection of the 3D neuronal filament-like however, this module has been developed to measure defined structures such as neurons, which are comprised of dendrites, axons and spines (tree-like structure). This module has been ingeniously utilized to make morphological measurements to non-neuronal cells(3), however, the output data provide information of an extended cellular network by using a software that depends on a defined cell shape rather than being an amorphous-shaped cellular model. To overcome the issue of analyzing amorphous-shaped cells and making the software more suitable to a biological application, Imaris developed Imaris Cell. This was a scientific project with the Eidgenössische Technische Hochschule, which has been developed to calculate the relationship between cells and organelles. While the software enables the detection of biological constraints, by forcing one nucleus per cell and using cell membranes to segment cells, it cannot be utilized to analyze fluorescence data that are not continuous because ideally it builds cell surface without void spaces. To our knowledge, at present no user-modifiable automated approach that provides morphometric information from 3D fluorescence images has been developed that achieves cellular spatial information of an undefined shape (Figure 1). We have developed an analytical platform using the Imaris core software module and Imaris XT interfaced to MATLAB (Mat Works, Inc.). These tools allow the 3D measurement of cells without a pre-defined shape and with inconsistent fluorescence network components. Furthermore, this method will allow researchers who have extended expertise in biological systems, but not familiarity to computer applications, to perform quantification of morphological changes in cell dynamics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Quality based frame selection is a crucial task in video face recognition, to both improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database shows that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.