362 resultados para Recognition accuracy


Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new system is described for estimating volume from a series of multiplanar 2D ultrasound images. Ultrasound images are captured using a personal computer video digitizing card and an electromagnetic localization system is used to record the pose of the ultrasound images. The accuracy of the system was assessed by scanning four groups of ten cadaveric kidneys on four different ultrasound machines. Scan image planes were oriented either radially, in parallel or slanted at 30 C to the vertical. The cross-sectional images of the kidneys were traced using a mouse and the outline points transformed to 3D space using the Fastrak position and orientation data. Points on adjacent region of interest outlines were connected to form a triangle mesh and the volume of the kidneys estimated using the ellipsoid, planimetry, tetrahedral and ray tracing methods. There was little difference between the results for the different scan techniques or volume estimation algorithms, although, perhaps as expected, the ellipsoid results were the least precise. For radial scanning and ray tracing, the mean and standard deviation of the percentage errors for the four different machines were as follows: Hitachi EUB-240, −3.0 ± 2.7%; Tosbee RM3, −0.1 ± 2.3%; Hitachi EUB-415, 0.2 ± 2.3%; Acuson, 2.7 ± 2.3%.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years, a number of phylogenetic methods have been developed for estimating molecular rates and divergence dates under models that relax the molecular clock constraint by allowing rate change throughout the tree. These methods are being used with increasing frequency, but there have been few studies into their accuracy. We tested the accuracy of several relaxed-clock methods (penalized likelihood and Bayesian inference using various models of rate change) using nucleotide sequences simulated on a nine-taxon tree. When the sequences evolved with a constant rate, the methods were able to infer rates accurately, but estimates were more precise when a molecular clock was assumed. When the sequences evolved under a model of autocorrelated rate change, rates were accurately estimated using penalized likelihood and by Bayesian inference using lognormal and exponential models of rate change, while other models did not perform as well. When the sequences evolved under a model of uncorrelated rate change, only Bayesian inference using an exponential rate model performed well. Collectively, the results provide a strong recommendation for using the exponential model of rate change if a conservative approach to divergence time estimation is required. A case study is presented in which we use a simulation-based approach to examine the hypothesis of elevated rates in the Cambrian period, and it is found that these high rate estimates might be an artifact of the rate estimation method. If this bias is present, then the ages of metazoan divergences would be systematically underestimated. The results of this study have implications for studies of molecular rates and divergence dates.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically and a framework is developed such that results from super-resolution techniques can be compared. Three super-resolution techniques are compared with the Eigenface and Elastic Bunch Graph Matching face recognition engines. Parameter ranges over which these techniques provide better recognition performance than interpolated images is determined.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification system in real world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance, however, the robustness of HTPLDA to the limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze the speaker verification performance with regards to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question, how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contract to many existing techniques, the technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds the state of the art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7 bit Lego light sensors, as well as using 16 and 32 pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For facial expression recognition systems to be applicable in the real world, they need to be able to detect and track a previously unseen person's face and its facial movements accurately in realistic environments. A highly plausible solution involves performing a "dense" form of alignment, where 60-70 fiducial facial points are tracked with high accuracy. The problem is that, in practice, this type of dense alignment had so far been impossible to achieve in a generic sense, mainly due to poor reliability and robustness. Instead, many expression detection methods have opted for a "coarse" form of face alignment, followed by an application of a biologically inspired appearance descriptor such as the histogram of oriented gradients or Gabor magnitudes. Encouragingly, recent advances to a number of dense alignment algorithms have demonstrated both high reliability and accuracy for unseen subjects [e.g., constrained local models (CLMs)]. This begs the question: Aside from countering against illumination variation, what do these appearance descriptors do that standard pixel representations do not? In this paper, we show that, when close to perfect alignment is obtained, there is no real benefit in employing these different appearance-based representations (under consistent illumination conditions). In fact, when misalignment does occur, we show that these appearance descriptors do work well by encoding robustness to alignment error. For this work, we compared two popular methods for dense alignment-subject-dependent active appearance models versus subject-independent CLMs-on the task of action-unit detection. These comparisons were conducted through a battery of experiments across various publicly available data sets (i.e., CK+, Pain, M3, and GEMEP-FERA). We also report our performance in the recent 2011 Facial Expression Recognition and Analysis Challenge for the subject-independent task.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Monitoring the natural environment is increasingly important as habit degradation and climate change reduce theworld’s biodiversity.We have developed software tools and applications to assist ecologists with the collection and analysis of acoustic data at large spatial and temporal scales.One of our key objectives is automated animal call recognition, and our approach has three novel attributes. First, we work with raw environmental audio, contaminated by noise and artefacts and containing calls that vary greatly in volume depending on the animal’s proximity to the microphone. Second, initial experimentation suggested that no single recognizer could dealwith the enormous variety of calls. Therefore, we developed a toolbox of generic recognizers to extract invariant features for each call type. Third, many species are cryptic and offer little data with which to train a recognizer. Many popular machine learning methods require large volumes of training and validation data and considerable time and expertise to prepare. Consequently we adopt bootstrap techniques that can be initiated with little data and refined subsequently. In this paper, we describe our recognition tools and present results for real ecological problems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting and illumination conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems makes them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded in 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

As one of the first institutional repositories in Australia and the first in the world to have an institution-wide deposit mandate, QUT ePrints has great ‘brand recognition’ within the University (Queensland University of Technology) and beyond. The repository is managed by the library but, over the years, the Library’s repository team has worked closely with other departments (especially the Office of Research and IT Services) to ensure that QUT ePrints was embedded into the business processes and systems our academics use regularly. For example, the repository is the source of the publication information which displays on each academic’s Staff Profile page. The repository pulls in citation data from Scopus and Web of Science and displays the data in the publications records. Researchers can monitor their citations at a glance via the repository ‘View’ which displays all their publications. A trend in recent years has been to populate institutional repositories with publication details imported from the University’s research information system (RIS). The main advantage of the RIS to Repository workflow is that it requires little input from the academics as the publication details are often imported into the RIS from publisher databases. Sadly, this is also its main disadvantage. Generally, only the metadata is imported from the RIS and the lack of engagement by the academics results in very low proportions of records with open access full-texts. Consequently, while we could see the value of integrating the two systems, we were determined to make the repository the entry point for publication data. In 2011, the University funded a project to convert a number of paper-based processes into web-based workflows. This included a workflow to replace the paper forms academics used to complete to report new publications (which were later used by the data entry staff to input the details into the RIS). Publication details and full-text files are uploaded to the repository (by the academics or their nominees). Each night, the repository (QUT ePrints) pushes the metadata for new publications into a holding table. The data is checked by Office of Research staff the next day and then ‘imported’ into the RIS. Publication details (including the repository URLs) are pushed from the RIS to the Staff Profiles system. Previously, academics were required to supply the Office of research with photocopies of their publication (for verification/auditing purposes). The repository is now the source of verification information. Library staff verify the accuracy of the publication details and, where applicable, the peer review status of the work. The verification metadata is included in the information passed to the Office of Research. The RIS at QUT comprises two separate systems built on an Oracle database; a proprietary product (ResearchMaster) plus a locally produced system known as RAD (Research Activity Database). The repository platform is EPrints which is built on a MySQL database. This partly explains why the data is passed from one system to the other via a holding table. The new workflow went live in early April 2012. Tests of the technical integration have all been successful. At the end of the first 12 months, the impact of the new workflow on the proportion of full-texts deposited will be evaluated.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Facial expression is an important channel of human social communication. Facial expression recognition (FER) aims to perceive and understand emotional states of humans based on information in the face. Building robust and high performance FER systems that can work in real-world video is still a challenging task, due to the various unpredictable facial variations and complicated exterior environmental conditions, as well as the difficulty of choosing a suitable type of feature descriptor for extracting discriminative facial information. Facial variations caused by factors such as pose, age, gender, race and occlusion, can exert profound influence on the robustness, while a suitable feature descriptor largely determines the performance. Most present attention on FER has been paid to addressing variations in pose and illumination. No approach has been reported on handling face localization errors and relatively few on overcoming facial occlusions, although the significant impact of these two variations on the performance has been proved and highlighted in many previous studies. Many texture and geometric features have been previously proposed for FER. However, few comparison studies have been conducted to explore the performance differences between different features and examine the performance improvement arisen from fusion of texture and geometry, especially on data with spontaneous emotions. The majority of existing approaches are evaluated on databases with posed or induced facial expressions collected in laboratory environments, whereas little attention has been paid on recognizing naturalistic facial expressions on real-world data. This thesis investigates techniques for building robust and high performance FER systems based on a number of established feature sets. It comprises of contributions towards three main objectives: (1) Robustness to face localization errors and facial occlusions. An approach is proposed to handle face localization errors and facial occlusions using Gabor based templates. Template extraction algorithms are designed to collect a pool of local template features and template matching is then performed to covert these templates into distances, which are robust to localization errors and occlusions. (2) Improvement of performance through feature comparison, selection and fusion. A comparative framework is presented to compare the performance between different features and different feature selection algorithms, and examine the performance improvement arising from fusion of texture and geometry. The framework is evaluated for both discrete and dimensional expression recognition on spontaneous data. (3) Evaluation of performance in the context of real-world applications. A system is selected and applied into discriminating posed versus spontaneous expressions and recognizing naturalistic facial expressions. A database is collected from real-world recordings and is used to explore feature differences between standard database images and real-world images, as well as between real-world images and real-world video frames. The performance evaluations are based on the JAFFE, CK, Feedtum, NVIE, Semaine and self-collected QUT databases. The results demonstrate high robustness of the proposed approach to the simulated localization errors and occlusions. Texture and geometry have different contributions to the performance of discrete and dimensional expression recognition, as well as posed versus spontaneous emotion discrimination. These investigations provide useful insights into enhancing robustness and achieving high performance of FER systems, and putting them into real-world applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recent efforts in mission planning for underwater vehicles have utilised predictive models to aid in navigation, optimal path planning and drive opportunistic sampling. Although these models provide information at a unprecedented resolutions and have proven to increase accuracy and effectiveness in multiple campaigns, most are deterministic in nature. Thus, predictions cannot be incorporated into probabilistic planning frameworks, nor do they provide any metric on the variance or confidence of the output variables. In this paper, we provide an initial investigation into determining the confidence of ocean model predictions based on the results of multiple field deployments of two autonomous underwater vehicles. For multiple missions conducted over a two-month period in 2011, we compare actual vehicle executions to simulations of the same missions through the Regional Ocean Modeling System in an ocean region off the coast of southern California. This comparison provides a qualitative analysis of the current velocity predictions for areas within the selected deployment region. Ultimately, we present a spatial heat-map of the correlation between the ocean model predictions and the actual mission executions. Knowing where the model provides unreliable predictions can be incorporated into planners to increase the utility and application of the deterministic estimations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Quality based frame selection is a crucial task in video face recognition, to both improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database shows that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.