986 resultados para Automatic Recognition


Relevância:

80.00% 80.00%

Publicador:

Resumo:

The study of acoustic communication in animals often requires not only the recognition of species specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented inspired by successful results obtained in the most widely known and complex acoustical communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed in recognizing other sound types.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a persons identity through the analysis of their speech and enables recognition of a subject at a distance over a telephone channel { wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems providing state-of-the-art performance. GMM's are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability | an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMMbased classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier. Further, the domains for the modelling of session variation were contrasted to find a number of common factors, however, the SVM-domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several beneficial characteristics to the task of unsupervised model adaptation prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate the approach to provide performance improvements over both the use of the complete candidate dataset and the best heuristically-selected dataset whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Novel techniques have been developed for the automatic recognition of human behaviour in challenging environments using information from visual and infra-red camera feeds. The techniques have been applied to two interesting scenarios: Recognise drivers' speech using lip movements and recognising audience behaviour, while watching a movie, using facial features and body movements. Outcome of the research in these two areas will be useful in the improving the performance of voice recognition in automobiles for voice based control and for obtaining accurate movie interest ratings based on live audience response analysis.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper, we describe a system for the automatic recognition of isolated handwritten Devanagari characters obtained by linearizing consonant conjuncts. Owing to the large number of characters and resulting demands on data acquisition, we use structural recognition techniques to reduce some characters to others. The residual characters are then classified using the subspace method. Finally the results of structural recognition and feature-based matching are mapped to give final output. The proposed system Ifs evaluated for the writer dependent scenario.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applications. The objective of this work is to recognize faces using video sequences both for training and recognition input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern have a wide variability and face images are of low resolution. In particular there are three areas of novelty: (i) we show how a photometric model of image formation can be combined with a statistical model of generic face appearance variation, learnt offline, to generalize in the presence of extreme illumination changes; (ii) we use the smoothness of geodesically local appearance manifold structure and a robust same-identity likelihood to achieve invariance to unseen head poses; and (iii) we introduce an accurate video sequence "reillumination" algorithm to achieve robustness to face motion patterns in video. We describe a fully automatic recognition system based on the proposed method and an extensive evaluation on 171 individuals and over 1300 video sequences with extreme illumination, pose and head motion variation. On this challenging data set our system consistently demonstrated a nearly perfect recognition rate (over 99.7%), significantly outperforming state-of-the-art commercial software and methods from the literature. © Springer-Verlag Berlin Heidelberg 2006.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applications. The objective of this work is to recognize faces using video sequences both for training and recognition input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern have a wide variability and face images are of low resolution. The central contribution is an illumination invariant, which we show to be suitable for recognition from video of loosely constrained head motion. In particular there are three contributions: (i) we show how a photometric model of image formation can be combined with a statistical model of generic face appearance variation to exploit the proposed invariant and generalize in the presence of extreme illumination changes; (ii) we introduce a video sequence re-illumination algorithm to achieve fine alignment of two video sequences; and (iii) we use the smoothness of geodesically local appearance manifold structure and a robust same-identity likelihood to achieve robustness to unseen head poses. We describe a fully automatic recognition system based on the proposed method and an extensive evaluation on 323 individuals and 1474 video sequences with extreme illumination, pose and head motion variation. Our system consistently achieved a nearly perfect recognition rate (over 99.7% on all four databases). © 2012 Elsevier Ltd All rights reserved.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applications. The objective of this work is to recognize faces using video sequences both for training and recognition input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern have a wide variability and face images are of low resolution. In particular there are three areas of novelty: (i) we show how a photometric model of image formation can be combined with a statistical model of generic face appearance variation, learnt offline, to generalize in the presence of extreme illumination changes; (ii) we use the smoothness of geodesically local appearance manifold structure and a robust same-identity likelihood to achieve invariance to unseen head poses; and (iii) we introduce an accurate video sequence “reillumination” algorithm to achieve robustness to face motion patterns in video. We describe a fully automatic recognition system based on the proposed method and an extensive evaluation on 171 individuals and over 1300 video sequences with extreme illumination, pose and head motion variation. On this challenging data set our system consistently demonstrated a nearly perfect recognition rate (over 99.7%), significantly outperforming state-of-the-art commercial software and methods from the literature

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The objective of this work is to recognize faces using video sequences both for training and novel input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern have a wide variability and face images are of low resolution. There are three major areas of novelty: (i) illumination generalization is achieved by combining coarse histogram correction with fine illumination manifold-based normalization; (ii) pose robustness is achieved by decomposing each appearance manifold into semantic Gaussian pose clusters, comparing the corresponding clusters and fusing the results using an RBF network; (iii) a fully automatic recognition system based on the proposed method is described and extensively evaluated on 600 head motion video sequences with extreme illumination, pose and motion pattern variation. On this challenging data set our system consistently demonstrated a very high recognition rate (95% on average), significantly outperforming state-of-the-art methods from the literature.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

A method for automatic scaling of oblique ionograms has been introduced. This method also provides a rejection procedure for ionograms that are considered to lack sufficient information, depicting a very good success rate. Observing the Kp index of each autoscaled ionogram, can be noticed that the behavior of the autoscaling program does not depend on geomagnetic conditions. The comparison between the values of the MUF provided by the presented software and those obtained by an experienced operator indicate that the procedure developed for detecting the nose of oblique ionogram traces is sufficiently efficient and becomes much more efficient as the quality of the ionograms improves. These results demonstrate the program allows the real-time evaluation of MUF values associated with a particular radio link through an oblique radio sounding. The automatic recognition of a part of the trace allows determine for certain frequencies, the time taken by the radio wave to travel the path between the transmitter and receiver. The reconstruction of the ionogram traces, suggests the possibility of estimating the electron density between the transmitter and the receiver, from an oblique ionogram. The showed results have been obtained with a ray-tracing procedure based on the integration of the eikonal equation and using an analytical ionospheric model with free parameters. This indicates the possibility of applying an adaptive model and a ray-tracing algorithm to estimate the electron density in the ionosphere between the transmitter and the receiver An additional study has been conducted on a high quality ionospheric soundings data set and another algorithm has been designed for the conversion of an oblique ionogram into a vertical one, using Martyn's theorem. This allows a further analysis of oblique soundings, throw the use of the INGV Autoscala program for the automatic scaling of vertical ionograms.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This work explores the automatic recognition of physical activity intensity patterns from multi-axial accelerometry and heart rate signals. Data collection was carried out in free-living conditions and in three controlled gymnasium circuits, for a total amount of 179.80 h of data divided into: sedentary situations (65.5%), light-to-moderate activity (17.6%) and vigorous exercise (16.9%). The proposed machine learning algorithms comprise the following steps: time-domain feature definition, standardization and PCA projection, unsupervised clustering (by k-means and GMM) and a HMM to account for long-term temporal trends. Performance was evaluated by 30 runs of a 10-fold cross-validation. Both k-means and GMM-based approaches yielded high overall accuracy (86.97% and 85.03%, respectively) and, given the imbalance of the dataset, meritorious F-measures (up to 77.88%) for non-sedentary cases. Classification errors tended to be concentrated around transients, what constrains their practical impact. Hence, we consider our proposal to be suitable for 24 h-based monitoring of physical activity in ambulatory scenarios and a first step towards intensity-specific energy expenditure estimators

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

To date, automatic recognition of semantic information such as salient objects and mid-level concepts from images is a challenging task. Since real-world objects tend to exist in a context within their environment, the computer vision researchers have increasingly incorporated contextual information for improving object recognition. In this paper, we present a method to build a visual contextual ontology from salient objects descriptions for image annotation. The ontologies include not only partOf/kindOf relations, but also spatial and co-occurrence relations. A two-step image annotation algorithm is also proposed based on ontology relations and probabilistic inference. Different from most of the existing work, we specially exploit how to combine representation of ontology, contextual knowledge and probabilistic inference. The experiments show that image annotation results are improved in the LabelMe dataset.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Acoustic sensors allow scientists to scale environmental monitoring over large spatiotemporal scales. The faunal vocalisations captured by these sensors can answer ecological questions, however, identifying these vocalisations within recorded audio is difficult: automatic recognition is currently intractable and manual recognition is slow and error prone. In this paper, a semi-automated approach to call recognition is presented. An automated decision support tool is tested that assists users in the manual annotation process. The respective strengths of human and computer analysis are used to complement one another. The tool recommends the species of an unknown vocalisation and thereby minimises the need for the memorization of a large corpus of vocalisations. In the case of a folksonomic tagging system, recommending species tags also minimises the proliferation of redundant tag categories. We describe two algorithms: (1) a “naïve” decision support tool (16%–64% sensitivity) with efficiency of O(n) but which becomes unscalable as more data is added and (2) a scalable alternative with 48% sensitivity and an efficiency ofO(log n). The improved algorithm was also tested in a HTML-based annotation prototype. The result of this work is a decision support tool for annotating faunal acoustic events that may be utilised by other bioacoustics projects.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Somente no ano de 2011 foram adquiridos mais de 1.000TB de novos registros digitais de imagem advindos de Sensoriamento Remoto orbital. Tal gama de registros, que possui uma progressão geométrica crescente, é adicionada, anualmente, a incrível e extraordinária massa de dados de imagens orbitais já existentes da superfície da Terra (adquiridos desde a década de 70 do século passado). Esta quantidade maciça de registros, onde a grande maioria sequer foi processada, requer ferramentas computacionais que permitam o reconhecimento automático de padrões de imagem desejados, de modo a permitir a extração dos objetos geográficos e de alvos de interesse, de forma mais rápida e concisa. A proposta de tal reconhecimento ser realizado automaticamente por meio da integração de técnicas de Análise Espectral e de Inteligência Computacional com base no Conhecimento adquirido por especialista em imagem foi implementada na forma de um integrador com base nas técnicas de Redes Neurais Computacionais (ou Artificiais) (através do Mapa de Características Auto- Organizáveis de Kohonen SOFM) e de Lógica Difusa ou Fuzzy (através de Mamdani). Estas foram aplicadas às assinaturas espectrais de cada padrão de interesse, formadas pelos níveis de quantização ou níveis de cinza do respectivo padrão em cada uma das bandas espectrais, de forma que a classificação dos padrões irá depender, de forma indissociável, da correlação das assinaturas espectrais nas seis bandas do sensor, tal qual o trabalho dos especialistas em imagens. Foram utilizadas as bandas 1 a 5 e 7 do satélite LANDSAT-5 para a determinação de cinco classes/alvos de interesse da cobertura e ocupação terrestre em três recortes da área-teste, situados no Estado do Rio de Janeiro (Guaratiba, Mangaratiba e Magé) nesta integração, com confrontação dos resultados obtidos com aqueles derivados da interpretação da especialista em imagens, a qual foi corroborada através de verificação da verdade terrestre. Houve também a comparação dos resultados obtidos no integrador com dois sistemas computacionais comerciais (IDRISI Taiga e ENVI 4.8), no que tange a qualidade da classificação (índice Kappa) e tempo de resposta. O integrador, com classificações híbridas (supervisionadas e não supervisionadas) em sua implementação, provou ser eficaz no reconhecimento automático (não supervisionado) de padrões multiespectrais e no aprendizado destes padrões, pois para cada uma das entradas dos recortes da área-teste, menor foi o aprendizado necessário para sua classificação alcançar um acerto médio final de 87%, frente às classificações da especialista em imagem. A sua eficácia também foi comprovada frente aos sistemas computacionais testados, com índice Kappa médio de 0,86.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

本文阐明了数字式热轴自动判别及集中检测的原理,并从理论上对一些重点环节进行分析。该系统经过实践检验,证明性能良好,功能齐全,自动判别准确率高。