993 results for handwritten recognition


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a novel place recognition algorithm inspired by the recent discovery of overlapping and multi-scale spatial maps in the rodent brain. We mimic this hierarchical framework by training arrays of Support Vector Machines to recognize places at multiple spatial scales. Place match hypotheses are then cross-validated across all spatial scales, a process which combines the spatial specificity of the finest spatial map with the consensus provided by broader mapping scales. Experiments on three real-world datasets including a large robotics benchmark demonstrate that mapping over multiple scales uniformly improves place recognition performance over a single scale approach without sacrificing localization accuracy. We present analysis that illustrates how matching over multiple scales leads to better place recognition performance and discuss several promising areas for future investigation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

To identify and categorize complex stimuli such as familiar objects or speech, the human brain integrates information that is abstracted at multiple levels from its sensory inputs. Using cross-modal priming for spoken words and sounds, this functional magnetic resonance imaging study identified 3 distinct classes of visuoauditory incongruency effects, selective for 1) spoken words in the left superior temporal sulcus (STS), 2) environmental sounds in the left angular gyrus (AG), and 3) both words and sounds in the lateral and medial prefrontal cortices (IFS/mPFC). From a cognitive perspective, these incongruency effects suggest that prior visual information influences the neural processes underlying speech and sound recognition at multiple levels, with the STS being involved in phonological, AG in semantic, and mPFC/IFS in higher conceptual processing. In terms of neural mechanisms, effective connectivity analyses (dynamic causal modeling) suggest that these incongruency effects may emerge via greater bottom-up effects from early auditory regions to intermediate multisensory integration areas (i.e., STS and AG). This is consistent with a predictive coding perspective on hierarchical Bayesian inference in the cortex where the domain of the prediction error (phonological vs. semantic) determines its regional expression (middle temporal gyrus/STS vs. AG/intraparietal sulcus).

Relevance:

20.00% 20.00%

Publisher:

Abstract:

RNA polymerase II (pol II) transcription termination requires co‐transcriptional recognition of a functional polyadenylation signal, but the molecular mechanisms that transduce this signal to pol II remain unclear. We show that Yhh1p/Cft1p, the yeast homologue of the mammalian AAUAAA interacting protein CPSF 160, is an RNA‐binding protein and provide evidence that it participates in poly(A) site recognition. Interestingly, RNA binding is mediated by a central domain composed of predicted β‐propeller‐forming repeats, which occurs in proteins of diverse cellular functions. We also found that Yhh1p/Cft1p bound specifically to the phosphorylated C‐terminal domain (CTD) of pol II in vitro and in a two‐hybrid test in vivo. Furthermore, transcriptional run‐on analysis demonstrated that yhh1 mutants were defective in transcription termination, suggesting that Yhh1p/Cft1p functions in the coupling of transcription and 3′‐end formation. We propose that direct interactions of Yhh1p/Cft1p with both the RNA transcript and the CTD are required to communicate poly(A) site recognition to elongating pol II to initiate transcription termination.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

There is substantial evidence for facial emotion recognition (FER) deficits in autism spectrum disorder (ASD). The extent of this impairment, however, remains unclear, and there is some suggestion that clinical groups might benefit from the use of dynamic rather than static images. High-functioning individuals with ASD (n = 36) and typically developing controls (n = 36) completed a computerised FER task involving static and dynamic expressions of the six basic emotions. The ASD group showed poorer overall performance in identifying anger and disgust and were disadvantaged by dynamic (relative to static) stimuli when presented with sad expressions. Among both groups, however, dynamic stimuli appeared to improve recognition of anger. This research provides further evidence of specific impairment in the recognition of negative emotions in ASD, but argues against any broad advantages associated with the use of dynamic displays.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The solutions proposed in this thesis contribute to improving gait recognition performance in practical scenarios, further enabling the adoption of gait recognition in real-world security and forensic applications that require identifying humans at a distance. Pioneering work has been conducted on frontal gait recognition using depth images to allow gait to be integrated with biometric walkthrough portals. The effects of challenging gait conditions including clothing, carrying goods, and viewpoint have been explored. Enhanced approaches are proposed for the segmentation, feature extraction, feature optimisation and classification elements, and state-of-the-art recognition performance has been achieved. A frontal depth gait database has been developed and made available to the research community for further investigation. Solutions are explored in 2D and 3D domains using multiple image sources, and both domain-specific and modality-independent gait features are proposed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis investigates face recognition in video under the presence of large pose variations. It proposes a solution that performs simultaneous detection of facial landmarks and head poses across large pose variations, employs discriminative modelling of feature distributions of faces with varying poses, and applies fusion of multiple classifiers to pose-mismatch recognition. Experiments on several benchmark datasets have demonstrated that improved performance is achieved using the proposed solution.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper evaluates the performance of different text recognition techniques for a mobile robot in an indoor (university campus) environment. We compared four different methods: our own approach using existing text detection methods (Maximally Stable Extremal Regions detector and Stroke Width Transform) combined with a convolutional neural network, two modes of the open source program Tesseract, and the experimental mobile app Google Goggles. The results show that a convolutional neural network combined with the Stroke Width Transform gives the best performance in correctly matched text on images with single characters, whereas Google Goggles gives the best performance on images with multiple words. The dataset used for this work is released as well.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Problem addressed: Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signals collected at the wrist and hip. Methodology: 52 children and adolescents (mean age 13.7 +/- 3.1 years) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and input into a regularized logistic regression model using R (Glmnet + L1). Results: Classification accuracy for the hip and wrist was 91.0% +/- 3.1% and 88.4% +/- 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential Impact: Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.
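
The windowing step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the sampling rate or which features were extracted, so the function name `window_features`, the 1 Hz example signal, and the choice of mean and standard deviation are all assumptions made for illustration, not the study's actual feature set.

```python
from statistics import mean, pstdev


def window_features(samples, sample_rate_hz, window_seconds=10):
    """Split a 1-D acceleration signal into fixed windows and summarize each one.

    Returns one (mean, standard deviation) pair per complete window;
    a trailing partial window is dropped.
    """
    window_len = int(sample_rate_hz * window_seconds)
    if window_len <= 0:
        raise ValueError("window must contain at least one sample")
    features = []
    # Step through the signal in non-overlapping windows.
    for start in range(0, len(samples) - window_len + 1, window_len):
        window = samples[start:start + window_len]
        features.append((mean(window), pstdev(window)))
    return features


# Hypothetical 1 Hz signal, 30 s long, alternating between 0 and 1.
signal = [0.0, 1.0] * 15
features = window_features(signal, sample_rate_hz=1)
```

Each resulting feature pair would then be one input row for a classifier such as the regularized logistic regression the abstract names.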

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Vision-based place recognition involves recognising familiar places despite changes in environmental conditions or camera viewpoint (pose). Existing training-free methods exhibit excellent invariance to either of these challenges, but not both simultaneously. In this paper, we present a technique for condition-invariant place recognition across large lateral platform pose variance for vehicles or robots travelling along routes. Our approach combines sideways facing cameras with a new multi-scale image comparison technique that generates synthetic views for input into the condition-invariant Sequence Matching Across Route Traversals (SMART) algorithm. We evaluate the system’s performance on multi-lane roads in two different environments across day-night cycles. In the extreme case of day-night place recognition across the entire width of a four-lane-plus-median-strip highway, we demonstrate performance of up to 44% recall at 100% precision, where current state-of-the-art fails.
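
The "recall at 100% precision" figure quoted in this abstract can be computed as in the sketch below. It assumes each candidate match carries a confidence score and a ground-truth label, and that every true match appears in the candidate list; the function name `recall_at_full_precision` and the example values are hypothetical.

```python
def recall_at_full_precision(scores, labels):
    """Return the highest recall reachable by a score threshold with no false positives.

    scores: confidence per candidate match (higher means more confident).
    labels: True where the candidate is a correct match, False otherwise.
    """
    positives = [s for s, ok in zip(scores, labels) if ok]
    negatives = [s for s, ok in zip(scores, labels) if not ok]
    if not positives:
        return 0.0
    if not negatives:
        return 1.0
    # The threshold must sit above the most confident wrong match.
    best_wrong = max(negatives)
    accepted = sum(1 for s in positives if s > best_wrong)
    return accepted / len(positives)


# Hypothetical scores: two correct matches outrank the only wrong one.
example = recall_at_full_precision(
    [0.9, 0.8, 0.7, 0.6, 0.5],
    [True, True, False, True, True],
)
```

In this example two of the four correct matches score above the wrong match, so the function returns 0.5.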

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis demonstrates that robots can learn about how the world changes, and can use this information to recognise where they are, even when the appearance of the environment has changed a great deal. The ability to localise in highly dynamic environments using vision only is a key tool for achieving long-term, autonomous navigation in unstructured outdoor environments. The proposed learning algorithms are designed to be unsupervised, and can be generated by the robot online in response to its observations of the world, without requiring information from a human operator or other external source.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents an online, unsupervised training algorithm enabling vision-based place recognition across a wide range of changing environmental conditions such as those caused by weather, seasons, and day-night cycles. The technique applies principal component analysis to distinguish between aspects of a location’s appearance that are condition-dependent and those that are condition-invariant. Removing the dimensions associated with environmental conditions produces condition-invariant images that can be used by appearance-based place recognition methods. This approach has a unique benefit – it requires training images from only one type of environmental condition, unlike existing data-driven methods that require training images with labelled frame correspondences from two or more environmental conditions. The method is applied to two benchmark variable condition datasets. Performance is equivalent or superior to the current state of the art despite the lesser training requirements, and is demonstrated to generalise to previously unseen locations.
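
The idea of removing dimensions from an image representation can be sketched as an orthogonal projection. The abstract does not say which principal components are treated as condition-dependent or how they are selected, so this sketch assumes the unwanted directions are already given as orthonormal vectors; `remove_directions` and the example vectors are hypothetical.

```python
def remove_directions(vector, directions):
    """Subtract from `vector` its projection onto each given direction.

    `directions` is assumed to be orthonormal (unit length, mutually
    perpendicular), as principal components are.
    """
    result = list(vector)
    for direction in directions:
        # Projection coefficient of the original vector along this unit direction.
        coefficient = sum(v * d for v, d in zip(vector, direction))
        result = [r - coefficient * d for r, d in zip(result, direction)]
    return result


# Hypothetical 3-D descriptor with the first axis treated as condition-dependent.
cleaned = remove_directions([3.0, 4.0, 5.0], [[1.0, 0.0, 0.0]])
```

The cleaned vector has no component left along the removed direction, while the remaining components are unchanged.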

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Recently, Convolutional Neural Networks (CNNs) have been shown to achieve state-of-the-art performance on various classification tasks. In this paper, we present for the first time a place recognition technique based on CNN models, by combining the powerful features learnt by CNNs with a spatial and sequential filter. Applying the system to a 70 km benchmark place recognition dataset, we achieve a 75% increase in recall at 100% precision, significantly outperforming all previous state-of-the-art techniques. We also conduct a comprehensive performance comparison of the utility of features from all 21 layers for place recognition, both for the benchmark dataset and for a second dataset with more significant viewpoint changes.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We describe a sequence of experiments investigating the strengths and limitations of Fukushima's neocognitron as a handwritten digit classifier. Using the results of these experiments as a foundation, we propose and evaluate improvements to Fukushima's original network in an effort to obtain higher recognition performance. The neocognitron's performance is shown to be strongly dependent on the choice of selectivity parameters and we present two methods to adjust these variables. Performance of the network under the more effective of the two new selectivity adjustment techniques suggests that the network fails to exploit the features that distinguish different classes of input data. To avoid this shortcoming, the network's final layer cells were replaced by a nonlinear classifier (a multilayer perceptron) to create a hybrid architecture. Tests of Fukushima's original system and the novel systems proposed in this paper suggest that it may be difficult for the neocognitron to achieve the performance of existing digit classifiers due to its reliance upon the supervisor's choice of selectivity parameters and training data. These findings pertain to Fukushima's implementation of the system and should not be seen as diminishing the practical significance of the concept of hierarchical feature extraction embodied in the neocognitron. © 1997 IEEE.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating noncritical in-car systems. Under such conditions, however, speech recognition accuracy degrades significantly, and techniques such as speech enhancement are required to improve it. Likelihood-maximizing (LIMA) frameworks optimize speech enhancement algorithms based on recognized state sequences rather than traditional signal-level criteria such as maximizing signal-to-noise ratio. LIMA frameworks typically require calibration utterances to generate optimized enhancement parameters that are used for all subsequent utterances. Under such a scheme, suboptimal recognition performance occurs in noise conditions that are significantly different from those present during the calibration session – a serious problem in rapidly changing noise environments out on the open road. In this chapter, we propose a dialog-based design that allows regular optimization iterations in order to track the ever-changing noise conditions. Experiments using Mel-filterbank noise subtraction (MFNS) are performed to determine the optimization requirements for vehicular environments and show that minimal optimization is required to improve speech recognition, avoid over-optimization, and ultimately assist with semi-real-time operation. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session only.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Empirical evidence suggests impaired facial emotion recognition in schizophrenia. However, the nature of this deficit is the subject of ongoing research. The current study tested the hypothesis that a generalized deficit at an early stage of face-specific processing (i.e. putatively subserved by the fusiform gyrus) accounts for impaired facial emotion recognition in schizophrenia as opposed to the Negative Emotion-specific Deficit Model, which suggests impaired facial information processing at subsequent stages. Event-related potentials (ERPs) were recorded from 11 schizophrenia patients and 15 matched controls while performing a gender discrimination and a facial emotion recognition task. Significant reduction of the face-specific vertex positive potential (VPP) at a peak latency of 165 ms was confirmed in schizophrenia subjects whereas their early visual processing, as indexed by P1, was found to be intact. Attenuated VPP was found to correlate with subsequent P3 amplitude reduction and to predict accuracy when performing a facial emotion discrimination task. A subset of ten schizophrenia patients and ten matched healthy control subjects also performed similar tasks in the magnetic resonance imaging scanner. Patients showed reduced blood oxygenation level-dependent (BOLD) activation in the fusiform, inferior frontal, middle temporal and middle occipital gyrus as well as in the amygdala. Correlation analyses revealed that VPP and the subsequent P3a ERP components predict fusiform gyrus BOLD activation. These results suggest that problems in facial affect recognition in schizophrenia may represent flow-on effects of a generalized deficit in early visual processing.