944 resultados para Recognition accuracy


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present a novel method for performing speaker recognition with very limited training data and in the presence of background noise. Similarity-based speaker recognition is considered so that speaker models can be created with limited training speech data. The proposed similarity is a form of cosine similarity used as a distance measure between speech feature vectors. Each speech frame is modelled using subband features, and into this framework, multicondition training and optimal feature selection are introduced, making the system capable of performing speaker recognition in the presence of realistic, time-varying noise, which is unknown during training. Speaker identi?cation experiments were carried out using the SPIDRE database. The performance of the proposed new system for noise compensation is compared to that of an oracle model; the speaker identi?cation accuracy for clean speech by the new system trained with limited training data is compared to that of a GMM trained with several minutes of speech. Both comparisons have demonstrated the effectiveness of the new model. Finally, experiments were carried out to test the new model for speaker identi?cation given limited training data and with differing levels and types of realistic background noise. The results have demonstrated the robustness of the new system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we demonstrate a simple and novel illumination model that can be used for illumination invariant facial recognition. This model requires no prior knowledge of the illumination conditions and can be used when there is only a single training image per-person. The proposed illumination model separates the effects of illumination over a small area of the face into two components; an additive component modelling the mean illumination and a multiplicative component, modelling the variance within the facial area. Illumination invariant facial recognition is performed in a piecewise manner, by splitting the face image into blocks, then normalizing the illumination within each block based on the new lighting model. The assumptions underlying this novel lighting model have been verified on the YaleB face database. We show that magnitude 2D Fourier features can be used as robust facial descriptors within the new lighting model. Using only a single training image per-person, our new method achieves high (in most cases 100%) identification accuracy on the YaleB, extended YaleB and CMU-PIE face databases.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Human listeners seem to have an impressive ability to recognize a wide variety of natural sounds. However, there is surprisingly little quantitative evidence to characterize this fundamental ability. Here the speed and accuracy of musical-sound recognition were measured psychophysically with a rich but acoustically balanced stimulus set. The set comprised recordings of notes from musical instruments and sung vowels. In a first experiment, reaction times were collected for three target categories: voice, percussion, and strings. In a go/no-go task, listeners reacted as quickly as possible to members of a target category while withholding responses to distractors (a diverse set of musical instruments). Results showed near-perfect accuracy and fast reaction times, particularly for voices. In a second experiment, voices were recognized among strings and vice-versa. Again, reaction times to voices were faster. In a third experiment, auditory chimeras were created to retain only spectral or temporal features of the voice. Chimeras were recognized accurately, but not as quickly as natural voices. Altogether, the data suggest rapid and accurate neural mechanisms for musical-sound recognition based on selectivity to complex spectro-temporal signatures of sound sources.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Smart Spaces, Ambient Intelligence, and Ambient Assisted Living are environmental paradigms that strongly depend on their capability to recognize human actions. While most solutions rest on sensor value interpretations and video analysis applications, few have realized the importance of incorporating common-sense capabilities to support the recognition process. Unfortunately, human action recognition cannot be successfully accomplished by only analyzing body postures. On the contrary, this task should be supported by profound knowledge of human agency nature and its tight connection to the reasons and motivations that explain it. The combination of this knowledge and the knowledge about how the world works is essential for recognizing and understanding human actions without committing common-senseless mistakes. This work demonstrates the impact that episodic reasoning has in improving the accuracy of a computer vision system for human action recognition. This work also presents formalization, implementation, and evaluation details of the knowledge model that supports the episodic reasoning.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This chapter describes an experimental system for the recognition of human faces from surveillance video. In surveillance applications, the system must be robust to changes in illumination, scale, pose and expression. The system must also be able to perform detection and recognition rapidly in real time. Our system detects faces using the Viola-Jones face detector, then extracts local features to build a shape-based feature vector. The feature vector is constructed from ratios of lengths and differences in tangents of angles, so as to be robust to changes in scale and rotations in-plane and out-of-plane. Consideration was given to improving the performance and accuracy of both the detection and recognition steps.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recent work suggests that the human ear varies significantly between different subjects and can be used for identification. In principle, therefore, using ears in addition to the face within a recognition system could improve accuracy and robustness, particularly for non-frontal views. The paper describes work that investigates this hypothesis using an approach based on the construction of a 3D morphable model of the head and ear. One issue with creating a model that includes the ear is that existing training datasets contain noise and partial occlusion. Rather than exclude these regions manually, a classifier has been developed which automates this process. When combined with a robust registration algorithm the resulting system enables full head morphable models to be constructed efficiently using less constrained datasets. The algorithm has been evaluated using registration consistency, model coverage and minimalism metrics, which together demonstrate the accuracy of the approach. To make it easier to build on this work, the source code has been made available online.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background: Diagnosis of meningococcal disease relies on recognition of clinical signs and symptoms that are notoriously non-specific, variable, and often absent in the early stages of the disease. Loop-mediated isothermal amplification (LAMP) has previously been shown to be fast and effective for the molecular detection of meningococcal DNA in clinical specimens. We aimed to assess the diagnostic accuracy of meningococcal LAMP as a near-patient test in the emergency department.

Methods: For this observational cohort study of diagnostic accuracy, children aged 0-13 years presenting to the emergency department of the Royal Belfast Hospital for Sick Children (Belfast, UK) with suspected meningococcal disease were eligible for inclusion. Patients underwent a standard meningococcal pack of investigations testing for meningococcal disease. Respiratory (nasopharyngeal swab) and blood specimens were collected from patients and tested with near-patient meningococcal LAMP and the results were compared with those obtained by reference laboratory tests (culture and PCR of blood and cerebrospinal fluid).

Findings: Between Nov 1, 2009, and Jan 31, 2012, 161 eligible children presenting at the hospital underwent the meningococcal pack of investigations and were tested for meningococcal disease, of whom 148 consented and were enrolled in the study. Combined testing of respiratory and blood specimens with use of LAMP was accurate (sensitivity 89% [95% CI 72-96], specificity 100% [97-100], positive predictive value 100% [85-100]; negative predictive value 98% [93-99]) and diagnostically useful (positive likelihood ratio 213 [95% CI 13-infinity] and negative likelihood ratio 0·11 [0·04-0·32]). The median time required for near-patient testing from sample to result was 1 h 26 min (IQR 1 h 20 min-1 h 32 min).

Interpretation: Meningococcal LAMP is straightforward enough for use in any hospital with basic laboratory facilities, and near-patient testing with this method is both feasible and effective. By contrast with existing UK National Institute of Health and Care Excellence guidelines, we showed that molecular testing of non-invasive respiratory specimens from children is diagnostically accurate and clinically useful.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Studies have been carried out to recognize individuals from a frontal view using their gait patterns. In previous work, gait sequences were captured using either single or stereo RGB camera systems or the Kinect 1.0 camera system. In this research, we used a new frontal view gait recognition method using a laser based Time of Flight (ToF) camera. In addition to the new gait data set, other contributions include enhancement of the silhouette segmentation, gait cycle estimation and gait image representations. We propose four new gait image representations namely Gait Depth Energy Image (GDE), Partial GDE (PGDE), Discrete Cosine Transform GDE (DGDE) and Partial DGDE (PDGDE). The experimental results show that all the proposed gait image representations produce better accuracy than the previous methods. In addition, we have also developed Fusion GDEs (FGDEs) which achieve better overall accuracy and outperform the previous methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

On-line handwriting recognition has been a frontier area of research for the last few decades under the purview of pattern recognition. Word processing turns to be a vexing experience even if it is with the assistance of an alphanumeric keyboard in Indian languages. A natural solution for this problem is offered through online character recognition. There is abundant literature on the handwriting recognition of western, Chinese and Japanese scripts, but there are very few related to the recognition of Indic script such as Malayalam. This paper presents an efficient Online Handwritten character Recognition System for Malayalam Characters (OHR-M) using K-NN algorithm. It would help in recognizing Malayalam text entered using pen-like devices. A novel feature extraction method, a combination of time domain features and dynamic representation of writing direction along with its curvature is used for recognizing Malayalam characters. This writer independent system gives an excellent accuracy of 98.125% with recognition time of 15-30 milliseconds

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we propose a handwritten character recognition system for Malayalam language. The feature extraction phase consists of gradient and curvature calculation and dimensionality reduction using Principal Component Analysis. Directional information from the arc tangent of gradient is used as gradient feature. Strength of gradient in curvature direction is used as the curvature feature. The proposed system uses a combination of gradient and curvature feature in reduced dimension as the feature vector. For classification, discriminative power of Support Vector Machine (SVM) is evaluated. The results reveal that SVM with Radial Basis Function (RBF) kernel yield the best performance with 96.28% and 97.96% of accuracy in two different datasets. This is the highest accuracy ever reported on these datasets

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Performance of any continuous speech recognition system is dependent on the accuracy of its acoustic model. Hence, preparation of a robust and accurate acoustic model lead to satisfactory recognition performance for a speech recognizer. In acoustic modeling of phonetic unit, context information is of prime importance as the phonemes are found to vary according to the place of occurrence in a word. In this paper we compare and evaluate the effect of context dependent tied (CD tied) models, context dependent (CD) and context independent (CI) models in the perspective of continuous speech recognition of Malayalam language. The database for the speech recognition system has utterance from 21 speakers including 11 female and 10 males. Our evaluation results show that CD tied models outperforms CI models over 21%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Development of Malayalam speech recognition system is in its infancy stage; although many works have been done in other Indian languages. In this paper we present the first work on speaker independent Malayalam isolated speech recognizer based on PLP (Perceptual Linear Predictive) Cepstral Coefficient and Hidden Markov Model (HMM). The performance of the developed system has been evaluated with different number of states of HMM (Hidden Markov Model). The system is trained with 21 male and female speakers in the age group ranging from 19 to 41 years. The system obtained an accuracy of 99.5% with the unseen data

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A connected digit speech recognition is important in many applications such as automated banking system, catalogue-dialing, automatic data entry, automated banking system, etc. This paper presents an optimum speaker-independent connected digit recognizer forMalayalam language. The system employs Perceptual Linear Predictive (PLP) cepstral coefficient for speech parameterization and continuous density Hidden Markov Model (HMM) in the recognition process. Viterbi algorithm is used for decoding. The training data base has the utterance of 21 speakers from the age group of 20 to 40 years and the sound is recorded in the normal office environment where each speaker is asked to read 20 set of continuous digits. The system obtained an accuracy of 99.5 % with the unseen data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents an efficient Online Handwritten character Recognition System for Malayalam Characters (OHR-M) using Kohonen network. It would help in recognizing Malayalam text entered using pen-like devices. It will be more natural and efficient way for users to enter text using a pen than keyboard and mouse. To identify the difference between similar characters in Malayalam a novel feature extraction method has been adopted-a combination of context bitmap and normalized (x, y) coordinates. The system reported an accuracy of 88.75% which is writer independent with a recognition time of 15-32 milliseconds

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Ecological validity of static and intense facial expressions in emotional recognition has been questioned. Recent studies have recommended the use of facial stimuli more compatible to the natural conditions of social interaction, which involves motion and variations in emotional intensity. In this study, we compared the recognition of static and dynamic facial expressions of happiness, fear, anger and sadness, presented in four emotional intensities (25 %, 50 %, 75 % and 100 %). Twenty volunteers (9 women and 11 men), aged between 19 and 31 years, took part in the study. The experiment consisted of two sessions in which participants had to identify the emotion of static (photographs) and dynamic (videos) displays of facial expressions on the computer screen. The mean accuracy was submitted to an Anova for repeated measures of model: 2 sexes x [2 conditions x 4 expressions x 4 intensities]. We observed an advantage for the recognition of dynamic expressions of happiness and fear compared to the static stimuli (p < .05). Analysis of interactions showed that expressions with intensity of 25 % were better recognized in the dynamic condition (p < .05). The addition of motion contributes to improve recognition especially in male participants (p < .05). We concluded that the effect of the motion varies as a function of the type of emotion, intensity of the expression and sex of the participant. These results support the hypothesis that dynamic stimuli have more ecological validity and are more appropriate to the research with emotions.