323 results for Digit speech recognition


Relevance:

20.00%

Publisher:

Abstract:

This paper presents a novel place recognition algorithm inspired by the recent discovery of overlapping and multi-scale spatial maps in the rodent brain. We mimic this hierarchical framework by training arrays of Support Vector Machines to recognize places at multiple spatial scales. Place match hypotheses are then cross-validated across all spatial scales, a process which combines the spatial specificity of the finest spatial map with the consensus provided by broader mapping scales. Experiments on three real-world datasets including a large robotics benchmark demonstrate that mapping over multiple scales uniformly improves place recognition performance over a single scale approach without sacrificing localization accuracy. We present analysis that illustrates how matching over multiple scales leads to better place recognition performance and discuss several promising areas for future investigation.
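The cross-scale check described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes, purely for illustration, that places are numbered along a route, that each coarser scale groups a fixed number of consecutive fine places into one segment, and that a classifier at each scale has already reported a match. The function name `consensus_match` and its arguments are hypothetical.

```python
def consensus_match(fine_candidates, coarse_matches):
    """Return the first fine-scale candidate supported by every coarser scale.

    fine_candidates: fine-scale place indices, best candidate first.
    coarse_matches:  dict mapping segment length (in fine places) to the
                     segment index matched at that coarser scale (assumed layout).
    Returns None when no candidate agrees with all scales.
    """
    for place in fine_candidates:
        # A fine place lies in coarse segment `place // size` under the
        # assumed fixed-length grouping.
        if all(place // size == segment for size, segment in coarse_matches.items()):
            return place
    return None


# The top fine-scale candidate (37) disagrees with the coarser scales,
# so the second candidate (12) is kept instead.
best = consensus_match([37, 12], {5: 2, 10: 1})
```

The design point is that the finest scale supplies the specific location while the broader scales only veto candidates that fall outside their matched segments.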

Relevance:

20.00%

Publisher:

Abstract:

Previous behavioral studies reported a robust effect of increased naming latencies when objects to be named were blocked within semantic category, compared to items blocked between category. This semantic context effect has been attributed to various mechanisms including inhibition or excitation of lexico-semantic representations and incremental learning of associations between semantic features and names, and is hypothesized to increase demands on verbal self-monitoring during speech production. Objects within categories also share many visual structural features, introducing a potential confound when interpreting the level at which the context effect might occur. Consistent with previous findings, we report a significant increase in response latencies when naming categorically related objects within blocks, an effect associated with increased perfusion fMRI signal bilaterally in the hippocampus and in the left middle to posterior superior temporal cortex. No perfusion changes were observed in the middle section of the left middle temporal cortex, a region associated with retrieval of lexical-semantic information in previous object naming studies. Although a manipulation of visual feature similarity did not influence naming latencies, we observed perfusion increases in the perirhinal cortex for naming objects with similar visual features that interacted with the semantic context in which objects were named. These results provide support for the view that the semantic context effect in object naming occurs due to an incremental learning mechanism, and involves increased demands on verbal self-monitoring.

Relevance:

20.00%

Publisher:

Abstract:

This paper investigates how neuronal activation for naming photographs of objects is influenced by the addition of appropriate colour or sound. Behaviourally, both colour and sound are known to facilitate object recognition from visual form. However, previous functional imaging studies have shown inconsistent effects. For example, the addition of appropriate colour has been shown to reduce antero-medial temporal activation whereas the addition of sound has been shown to increase posterior superior temporal activation. Here we compared the effect of adding colour or sound cues in the same experiment. We found that the addition of either the appropriate colour or sound increased activation for naming photographs of objects in bilateral occipital regions and the right anterior fusiform. Moreover, the addition of colour reduced left antero-medial temporal activation but this effect was not observed for the addition of object sound. We propose that activation in bilateral occipital and right fusiform areas precedes the integration of visual form with either its colour or associated sound. In contrast, left antero-medial temporal activation is reduced because object recognition is facilitated after colour and form have been integrated.

Relevance:

20.00%

Publisher:

Abstract:

RNA polymerase II (pol II) transcription termination requires co‐transcriptional recognition of a functional polyadenylation signal, but the molecular mechanisms that transduce this signal to pol II remain unclear. We show that Yhh1p/Cft1p, the yeast homologue of the mammalian AAUAAA interacting protein CPSF 160, is an RNA‐binding protein and provide evidence that it participates in poly(A) site recognition. Interestingly, RNA binding is mediated by a central domain composed of predicted β‐propeller‐forming repeats, which occurs in proteins of diverse cellular functions. We also found that Yhh1p/Cft1p bound specifically to the phosphorylated C‐terminal domain (CTD) of pol II in vitro and in a two‐hybrid test in vivo. Furthermore, transcriptional run‐on analysis demonstrated that yhh1 mutants were defective in transcription termination, suggesting that Yhh1p/Cft1p functions in the coupling of transcription and 3′‐end formation. We propose that direct interactions of Yhh1p/Cft1p with both the RNA transcript and the CTD are required to communicate poly(A) site recognition to elongating pol II to initiate transcription termination.

Relevance:

20.00%

Publisher:

Abstract:

There is substantial evidence for facial emotion recognition (FER) deficits in autism spectrum disorder (ASD). The extent of this impairment, however, remains unclear, and there is some suggestion that clinical groups might benefit from the use of dynamic rather than static images. High-functioning individuals with ASD (n = 36) and typically developing controls (n = 36) completed a computerised FER task involving static and dynamic expressions of the six basic emotions. The ASD group showed poorer overall performance in identifying anger and disgust and were disadvantaged by dynamic (relative to static) stimuli when presented with sad expressions. Among both groups, however, dynamic stimuli appeared to improve recognition of anger. This research provides further evidence of specific impairment in the recognition of negative emotions in ASD, but argues against any broad advantages associated with the use of dynamic displays.

Relevance:

20.00%

Publisher:

Abstract:

The solutions proposed in this thesis improve gait recognition performance in practical scenarios, further enabling the adoption of gait recognition in real-world security and forensic applications that require identifying humans at a distance. Pioneering work has been conducted on frontal gait recognition using depth images to allow gait to be integrated with biometric walkthrough portals. The effects of challenging gait conditions, including clothing, carried goods, and viewpoint, have been explored. Enhanced approaches are proposed for the segmentation, feature extraction, feature optimisation and classification stages, and state-of-the-art recognition performance has been achieved. A frontal depth gait database has been developed and made available to the research community for further investigation. Solutions are explored in the 2D and 3D domains using multiple image sources, and both domain-specific and modality-independent gait features are proposed.

Relevance:

20.00%

Publisher:

Abstract:

This PhD research has provided novel solutions to three major challenges that have prevented the widespread deployment of speaker recognition technology: (1) combating enrolment/verification mismatch, (2) reducing the large amount of development and training data that is required, and (3) reducing the duration of speech required to verify a speaker. A range of applications of speaker recognition technology, from forensics in criminal investigations to secure access in banking, will benefit from the research outcomes.

Relevance:

20.00%

Publisher:

Abstract:

This thesis investigates face recognition in video in the presence of large pose variations. It proposes a solution that performs simultaneous detection of facial landmarks and head poses across large pose variations, employs discriminative modelling of feature distributions of faces with varying poses, and applies fusion of multiple classifiers to pose-mismatch recognition. Experiments on several benchmark datasets demonstrate that the proposed solution improves performance.

Relevance:

20.00%

Publisher:

Abstract:

This paper evaluates the performance of different text recognition techniques for a mobile robot in an indoor (university campus) environment. We compared four different methods: our own approach using existing text detection methods (the Maximally Stable Extremal Regions detector and the Stroke Width Transform) combined with a convolutional neural network, two modes of the open-source program Tesseract, and the experimental mobile app Google Goggles. The results show that a convolutional neural network combined with the Stroke Width Transform gives the best performance in correctly matched text on images with single characters, whereas Google Goggles gives the best performance on images with multiple words. The dataset used for this work is released as well.

Relevance:

20.00%

Publisher:

Abstract:

Problem addressed: Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signals collected at the wrist and hip. Methodology: 52 children and adolescents (mean age 13.7 ± 3.1 years) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and input into a regularized logistic regression model using R (Glmnet + L1). Results: Classification accuracy for the hip and wrist was 91.0% ± 3.1% and 88.4% ± 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential impact: Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.
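The 10-s windowing step mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the abstract does not list the features used, so the per-window mean and standard deviation, the function name `window_features`, and the sampling rate in the usage lines are all assumptions.

```python
from statistics import mean, pstdev


def window_features(signal, sample_rate_hz, window_s=10):
    """Split a 1-D acceleration signal into non-overlapping windows and
    return a (mean, standard deviation) pair for each complete window.

    Trailing samples that do not fill a whole window are dropped.
    """
    size = int(sample_rate_hz * window_s)
    features = []
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        features.append((mean(window), pstdev(window)))
    return features


# Toy signal sampled at a hypothetical 2 Hz: 10 s at rest, then 10 s of
# alternating +1/-1 acceleration.
toy_signal = [0.0] * 20 + [1.0, -1.0] * 10
toy_features = window_features(toy_signal, sample_rate_hz=2)
```

Each feature tuple would then become one row of the design matrix passed to a classifier such as the L1-regularized logistic regression the study describes.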

Relevance:

20.00%

Publisher:

Abstract:

Vision-based place recognition involves recognising familiar places despite changes in environmental conditions or camera viewpoint (pose). Existing training-free methods exhibit excellent invariance to either of these challenges, but not both simultaneously. In this paper, we present a technique for condition-invariant place recognition across large lateral platform pose variance for vehicles or robots travelling along routes. Our approach combines sideways facing cameras with a new multi-scale image comparison technique that generates synthetic views for input into the condition-invariant Sequence Matching Across Route Traversals (SMART) algorithm. We evaluate the system’s performance on multi-lane roads in two different environments across day-night cycles. In the extreme case of day-night place recognition across the entire width of a four-lane-plus-median-strip highway, we demonstrate performance of up to 44% recall at 100% precision, where current state-of-the-art fails.
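For readers unfamiliar with the "recall at 100% precision" figure quoted above, the sketch below shows one simple way such a number can be computed. It assumes a simplified setting that is not taken from the paper: every query yields exactly one candidate match with a confidence score, ground truth says whether that candidate is correct, and recall is measured over the correct candidates. The function name is hypothetical.

```python
def recall_at_full_precision(scores, is_correct):
    """Highest recall reachable by a score threshold that accepts no wrong match.

    scores:     confidence per candidate match (higher = more confident).
    is_correct: ground-truth flag per candidate match.
    """
    total_correct = sum(1 for ok in is_correct if ok)
    if total_correct == 0:
        return 0.0
    # The threshold must sit strictly above the best-scoring wrong match.
    wrong_scores = [s for s, ok in zip(scores, is_correct) if not ok]
    cutoff = max(wrong_scores) if wrong_scores else float("-inf")
    accepted = sum(1 for s, ok in zip(scores, is_correct) if ok and s > cutoff)
    return accepted / total_correct


# Two of the three correct matches score above the best wrong match (0.7).
example_recall = recall_at_full_precision(
    [0.9, 0.8, 0.7, 0.6, 0.5],
    [True, True, False, True, False],
)
```

The metric is strict by design: a single confidently wrong match caps the recall, which is why it is favoured where false localisations are costly.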

Relevance:

20.00%

Publisher:

Abstract:

This thesis demonstrates that robots can learn about how the world changes, and can use this information to recognise where they are, even when the appearance of the environment has changed a great deal. The ability to localise in highly dynamic environments using vision only is a key tool for achieving long-term, autonomous navigation in unstructured outdoor environments. The proposed learning algorithms are designed to be unsupervised, and can be generated by the robot online in response to its observations of the world, without requiring information from a human operator or other external source.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents an online, unsupervised training algorithm enabling vision-based place recognition across a wide range of changing environmental conditions such as those caused by weather, seasons, and day-night cycles. The technique applies principal component analysis to distinguish between aspects of a location’s appearance that are condition-dependent and those that are condition-invariant. Removing the dimensions associated with environmental conditions produces condition-invariant images that can be used by appearance-based place recognition methods. This approach has a unique benefit – it requires training images from only one type of environmental condition, unlike existing data-driven methods that require training images with labelled frame correspondences from two or more environmental conditions. The method is applied to two benchmark variable condition datasets. Performance is equivalent or superior to the current state of the art despite the lesser training requirements, and is demonstrated to generalise to previously unseen locations.
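The idea of removing principal components from image data can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's procedure: the abstract does not say which or how many dimensions are removed, so the sketch assumes a single leading component is the condition-dependent one, treats each image as a flat list of numbers, and estimates the component by power iteration. The function name `remove_leading_component` is hypothetical.

```python
import math


def remove_leading_component(rows, iterations=200):
    """Centre the rows, estimate the first principal component by power
    iteration, and subtract each row's projection onto that component.

    Returns (cleaned_rows, component). Power iteration can stall if the
    starting direction happens to be orthogonal to the leading component.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Arbitrary, deliberately asymmetric starting direction, normalised.
    v = [float(j + 1) for j in range(d)]
    scale = math.sqrt(sum(x * x for x in v))
    v = [x / scale for x in v]

    for _ in range(iterations):
        # Apply X^T X to v without forming the covariance matrix explicitly.
        proj = [sum(x * y for x, y in zip(row, v)) for row in centred]
        w = [sum(p * row[j] for p, row in zip(proj, centred)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    cleaned = []
    for row in centred:
        p = sum(x * y for x, y in zip(row, v))
        cleaned.append([x - p * y for x, y in zip(row, v)])
    return cleaned, v


# Toy data whose variation lies almost entirely along the first axis.
toy_rows = [[-2.0, 0.1], [-1.0, -0.1], [1.0, 0.1], [2.0, -0.1]]
toy_cleaned, toy_component = remove_leading_component(toy_rows)
```

After the call, the cleaned rows carry no variation along the estimated component, so any downstream comparison sees only the remaining dimensions.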

Relevance:

20.00%

Publisher:

Abstract:

Recently, Convolutional Neural Networks (CNNs) have been shown to achieve state-of-the-art performance on various classification tasks. In this paper, we present for the first time a place recognition technique based on CNN models, combining the powerful features learnt by CNNs with a spatial and sequential filter. Applying the system to a 70 km benchmark place recognition dataset, we achieve a 75% increase in recall at 100% precision, significantly outperforming all previous state-of-the-art techniques. We also conduct a comprehensive performance comparison of the utility of features from all 21 layers for place recognition, both for the benchmark dataset and for a second dataset with more significant viewpoint changes.

Relevance:

20.00%

Publisher:

Abstract:

We describe a sequence of experiments investigating the strengths and limitations of Fukushima's neocognitron as a handwritten digit classifier. Using the results of these experiments as a foundation, we propose and evaluate improvements to Fukushima's original network in an effort to obtain higher recognition performance. The neocognitron's performance is shown to be strongly dependent on the choice of selectivity parameters and we present two methods to adjust these variables. Performance of the network under the more effective of the two new selectivity adjustment techniques suggests that the network fails to exploit the features that distinguish different classes of input data. To avoid this shortcoming, the network's final layer cells were replaced by a nonlinear classifier (a multilayer perceptron) to create a hybrid architecture. Tests of Fukushima's original system and the novel systems proposed in this paper suggest that it may be difficult for the neocognitron to achieve the performance of existing digit classifiers due to its reliance upon the supervisor's choice of selectivity parameters and training data. These findings pertain to Fukushima's implementation of the system and should not be seen as diminishing the practical significance of the concept of hierarchical feature extraction embodied in the neocognitron. © 1997 IEEE.