936 results for image recognition
An approach to statistical lip modelling for speaker identification via chromatic feature extraction
Abstract:
This paper presents a novel technique for tracking moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. The dimensionality of the lip features is reduced via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS database show encouraging results.
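The GMM-based identification step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes diagonal covariances, and the model parameters in `SPEAKER_MODELS`, the speaker names, and the two-dimensional features are made up for the example. In the paper the features would come from the PCA and LDA stages.

```python
import math

def diag_gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))

def identify_speaker(features, speaker_models):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    scores = {}
    for name, (weights, means, variances) in speaker_models.items():
        total = sum(
            diag_gmm_log_likelihood(x, weights, means, variances) for x in features
        )
        scores[name] = total / len(features)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical models: (weights, means, variances) per speaker.
SPEAKER_MODELS = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

if __name__ == "__main__":
    best, scores = identify_speaker([[0.2, -0.1], [0.9, 1.1]], SPEAKER_MODELS)
    print(best, scores)
```

Averaging the log-likelihood over frames keeps scores comparable between utterances of different lengths.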
Abstract:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based on multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Multi-modal MSHMMs have recently been applied successfully to speech recognition. Temporal lip information has been used for speaker identification previously (T.J. Wark et al., 1998); however, that work was restricted to output fusion via single-stream HMMs. We present an extension to this previous work and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
Investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
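To make the role of the modality weighting concrete, the sketch below fuses a speech score and a lip score with a fixed linear weight and computes false acceptance and false rejection rates at a threshold. It is an assumption-laden illustration: the linear fusion rule, the function names, and the convention that higher scores mean "more likely genuine" are choices made here, and the abstract's own technique for choosing the weight is not reproduced.

```python
def fuse_scores(speech_score, lip_score, speech_weight):
    """Linear score fusion; speech_weight in [0, 1], the lip weight is the remainder."""
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")
    return speech_weight * speech_score + (1.0 - speech_weight) * lip_score

def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when trials scoring at or above threshold are accepted."""
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr

if __name__ == "__main__":
    # Hypothetical (speech, lip) score pairs; noisy audio makes speech scores unreliable.
    genuine = [(0.4, 0.9), (0.6, 0.8)]
    impostor = [(0.6, 0.1), (0.3, 0.2)]
    for w in (1.0, 0.5, 0.0):
        fused_genuine = [fuse_scores(s, l, w) for s, l in genuine]
        fused_impostor = [fuse_scores(s, l, w) for s, l in impostor]
        print(w, error_rates(fused_genuine, fused_impostor, threshold=0.5))
```

Sweeping `speech_weight` as in the usage loop shows how shifting weight toward the lip stream can change both error rates when the audio is degraded.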
Abstract:
The UAV Challenge takes place every year. Teams of competitors use an Unmanned Airborne Vehicle to locate a simulated lost person and deliver water.
Abstract:
In many parts of the world, uncontrolled fires in sparsely populated areas are a major concern as they can quickly grow into large and destructive conflagrations in short time spans. Detecting these fires has traditionally been a job for trained humans on the ground, or in the air. In many cases, these manned solutions are simply not able to survey the amount of area necessary to maintain sufficient vigilance and coverage. This paper investigates the use of unmanned aerial systems (UAS) for automated wildfire detection. The proposed system uses low-cost, consumer-grade electronics and sensors combined with various airframes to create a system suitable for automatic detection of wildfires. The system employs automatic image processing techniques to analyze captured images and autonomously detect fire-related features such as fire lines, burnt regions, and flammable material. This image recognition algorithm is designed to cope with environmental occlusions such as shadows, smoke and obstructions. Once the fire is identified and classified, it is used to initialize a spatial/temporal fire simulation. This simulation is based on occupancy maps whose fidelity can be varied to include stochastic elements, various types of vegetation, weather conditions, and unique terrain. The simulations can be used to predict the effects of optimized firefighting methods to prevent the future propagation of the fires and greatly reduce time to detection of wildfires, thereby greatly minimizing the ensuing damage. This paper also documents experimental flight tests using a SenseFly Swinglet UAS conducted in Brisbane, Australia as well as modifications for custom UAS.
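The occupancy-map fire simulation with stochastic elements can be illustrated with a small cellular-automaton sketch. Everything specific here is an assumption for illustration: the four cell states, the four-neighbour spread rule, and the single `spread_prob` parameter stand in for the vegetation, weather, and terrain effects that the paper mentions but does not specify in the abstract.

```python
import random

# Hypothetical cell states for the occupancy map.
EMPTY, FUEL, BURNING, BURNT = 0, 1, 2, 3

def step(grid, spread_prob, rng):
    """Advance the fire by one time step and return a new grid."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != BURNING:
                continue
            new_grid[r][c] = BURNT  # a burning cell burns out after one step
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and grid[nr][nc] == FUEL and rng.random() < spread_prob:
                    new_grid[nr][nc] = BURNING
    return new_grid

def simulate(grid, spread_prob, steps, seed=0):
    """Run the stochastic spread for a number of steps with a seeded generator."""
    rng = random.Random(seed)
    for _ in range(steps):
        grid = step(grid, spread_prob, rng)
    return grid

if __name__ == "__main__":
    forest = [[FUEL] * 5 for _ in range(5)]
    forest[2][2] = BURNING  # initial detection from the image recognition stage
    for row in simulate(forest, spread_prob=0.6, steps=3, seed=42):
        print(row)
```

Cells marked `EMPTY` act as firebreaks in this sketch, which is one simple way to explore the effect of a firefighting intervention on later spread.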
Abstract:
Colorimetric quantification of human facial skin shows a wide dispersion of values. This dispersion varies with the color space (HSV or YCbCr) adopted for the analysis, and the smaller the dispersion, the better suited the space is to face recognition. The aim of this work is to analyze the statistical distribution of the colorimetry of digitized face images. The analysis can indicate whether color coordinates such as saturation, hue, and value can assist face recognition techniques. As a result of the analysis, we expect to conclude which of the color coordinate systems (HSV or YCbCr) is better suited to face recognition applications. The results will be presented on the basis of information design principles. The large number of photographic samples available for analysis (530) and the correct balance of lighting, contrast, and color temperature are the main distinguishing features of this work.
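The comparison of dispersion between the two color spaces can be sketched as below. This is an illustrative assumption rather than the study's procedure: it uses the population standard deviation as the dispersion measure, the full-range BT.601 (JFIF) equations for YCbCr, and a handful of made-up skin-tone pixels. Hue is treated as a linear quantity here, which is only reasonable when samples do not straddle the 0/1 wrap-around.

```python
import colorsys
import statistics

def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def channel_dispersion(pixels):
    """Standard deviation per channel, with every channel scaled to [0, 1]."""
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for r, g, b in pixels]
    ycc = [rgb_to_ycbcr(r, g, b) for r, g, b in pixels]
    return {
        "H": statistics.pstdev([h for h, _, _ in hsv]),
        "S": statistics.pstdev([s for _, s, _ in hsv]),
        "V": statistics.pstdev([v for _, _, v in hsv]),
        "Y": statistics.pstdev([y / 255.0 for y, _, _ in ycc]),
        "Cb": statistics.pstdev([cb / 255.0 for _, cb, _ in ycc]),
        "Cr": statistics.pstdev([cr / 255.0 for _, _, cr in ycc]),
    }

if __name__ == "__main__":
    # Hypothetical skin-tone samples as (R, G, B) triples.
    samples = [(224, 172, 105), (198, 134, 66), (241, 194, 125), (141, 85, 36)]
    for channel, spread in channel_dispersion(samples).items():
        print(f"{channel}: {spread:.4f}")
```

Scaling all channels to the same range is what makes the per-channel spreads comparable between the two spaces.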
Abstract:
Background: Pseudomonas syringae can cause stem necrosis and canker in a wide range of woody species including cherry, plum, peach, horse chestnut and ash. The detection and quantification of lesion progression over time in woody tissues is a key trait for breeders to select upon for resistance. Results: In this study a general, rapid and reliable approach to lesion quantification using image recognition and an artificial neural network model was developed. This was applied to screen both the virulence of a range of P. syringae pathovars and the resistance of a set of cherry and plum accessions to bacterial canker. The method developed was more objective than scoring by eye and allowed the detection of putatively resistant plant material for further study. Conclusions: Automated image analysis will facilitate rapid screening of material for resistance to bacterial and other phytopathogens, allowing more efficient selection and quantification of resistance responses.
Abstract:
This work seeks to demonstrate the advantages of automating functional software tests with the Sikuli tool, which uses image recognition to find the graphical elements of a system. It also uses a custom library whose methods automate the summarization of the test results and their evidence.
Abstract:
This thesis addresses the problem of vehicle mobility in fog. A prototype with a client-server architecture was developed, focusing mainly on data analysis for the creation of an alternative route. The mobile operating system considered is Apple's iOS 7, one of the most widespread mobile operating systems on the market today, with a large user base. The server side was developed according to the REST architecture: an HTTP server receives requests and responds appropriately to clients through a bidirectional exchange of data in JSON format. The server side includes the database, a very important component because it implements part of the system logic internally through stored procedures. The client side is an application for iPad and iPhone devices called Fog Escaping, developed according to the MVC (Model-View-Controller) pattern. Fog Escaping implements a greedy algorithm for finding the alternative route, which can be used for several types of applications.
Abstract:
An extensive study of the morphology and the dynamics of the equatorial ionosphere over South America is presented here. A multi-parametric approach is used to describe the physical characteristics of the ionosphere in the regions where the combination of the thermospheric electric field and the horizontal geomagnetic field creates the so-called Equatorial Ionization Anomalies. Ground-based measurements from GNSS receivers are used to link the Total Electron Content (TEC), its spatial gradients and the phenomenon known as scintillation, which can lead to GNSS signal degradation or even to a GNSS signal ‘loss of lock’. A new algorithm to highlight the features characterizing the TEC distribution is developed in the framework of this thesis, and the results obtained are validated and used to improve the performance of a GNSS positioning technique (long baseline RTK). In addition, the correlation between scintillation and the dynamics of the ionospheric irregularities is investigated. Using software implemented for this work, the velocity of the ionospheric irregularities is evaluated from high sampling rate GNSS measurements. The results highlight the parallel behaviour of the amplitude scintillation index (S4) occurrence and the zonal velocity of the ionospheric irregularities, at least during severe scintillation conditions (post-sunset hours). This suggests that scintillations are driven by TEC gradients as well as by the dynamics of the ionospheric plasma. Finally, given the importance of such studies for technological applications (e.g. GNSS high-precision applications), a validation of the NeQuick model (i.e. the model used in the new GALILEO satellites for TEC modelling) is performed. The NeQuick performance dramatically improves when data from HF radar sounding (ionograms) are ingested. A custom-designed algorithm, based on image recognition techniques, is developed to properly select the ingested data, leading to further improvement of the NeQuick performance.
Abstract:
The problem addressed is the design and development of an interactive system for learning about and taking guided tours of art cities. The goal is a mobile application that offers both the creation of guided tours and their use without an internet connection. To make the services more pleasant and fun to use, the guided tours are implemented as photographic treasure hunts, whose stages consist of textual clues that require photographic answers to be solved. A community was also created for sharing the treasure hunts and keeping game statistics. The original contribution of this thesis is the design and implementation of an Android app, called GeoPhotoHunt, which uses the idea of a photographic, geolocated treasure hunt to facilitate guided tours of places of interest without requiring an internet connection. The client is made independent of the server by moving the image recognition algorithms onto the client. Freeing the client from the need for an internet connection allows it to be used in foreign cities as well, where network access is usually unavailable.
Abstract:
Obesity is becoming an epidemic in most developed countries. The fundamental cause of obesity and overweight is an energy imbalance between calories consumed and calories expended. Monitoring everyday food intake is essential for obesity prevention and management. Existing dietary assessment methods usually require manual recording and recall of food types and portions. The accuracy of the results relies on uncertain factors such as the user's memory, food knowledge, and portion estimation, and is therefore often compromised. Accurate and convenient dietary assessment methods are still lacking and are needed by both the general population and the research community. In this thesis, an automatic food intake assessment method using the cameras and inertial measurement units (IMUs) on smartphones was developed to help people foster a healthy lifestyle. With this method, users use their smartphones before and after a meal to capture images or videos of the meal. The smartphone recognizes the food items, calculates the volume of food consumed, and reports the results to the user. The technical objective is to explore the feasibility of image-based food recognition and image-based volume estimation. This thesis comprises five publications that address four specific goals: (1) to develop a prototype system with existing methods in order to review the methods in the literature, identify their drawbacks, and explore the feasibility of developing novel methods; (2) based on the prototype system, to investigate new food classification methods that improve recognition accuracy to a level suitable for field application; (3) to design indexing methods for large-scale image databases to facilitate the development of new food image recognition and retrieval algorithms; (4) to develop novel, convenient, and accurate food volume estimation methods using only smartphones with cameras and IMUs. A prototype system was implemented to review existing methods.
An image feature detector and descriptor were developed, and a nearest neighbor classifier was implemented to classify food items. A credit card marker method was introduced for metric-scale 3D reconstruction and volume calculation. To increase recognition accuracy, novel multi-view food recognition algorithms were developed to recognize regularly shaped food items. To further increase accuracy and make the algorithm applicable to arbitrary food items, new food features and new classifiers were designed. The efficiency of the algorithm was increased by developing a novel image indexing method for large-scale image databases. Finally, the volume calculation was enhanced by reducing the marker and introducing IMUs. Sensor fusion techniques that combine measurements from cameras and IMUs were explored to infer the metric scale of the 3D model and to reduce noise from these sensors.
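The idea of using a card of known size to fix the metric scale of a reconstruction can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's method: it assumes the 3D model is already reconstructed in arbitrary units, that the card's long edge has been measured in those same units, and that the card is a standard ISO/IEC 7810 ID-1 card (85.60 mm wide). The function names are hypothetical.

```python
CARD_WIDTH_MM = 85.60  # long edge of an ISO/IEC 7810 ID-1 card

def metric_scale(card_width_model_units):
    """Millimetres per model unit, from the card's measured width in the model."""
    if card_width_model_units <= 0:
        raise ValueError("card width in model units must be positive")
    return CARD_WIDTH_MM / card_width_model_units

def volume_ml(volume_model_units, scale_mm_per_unit):
    """Convert a volume in cubic model units to millilitres."""
    # Lengths scale linearly, so volumes scale with the cube of the factor.
    volume_mm3 = volume_model_units * scale_mm_per_unit ** 3
    return volume_mm3 / 1000.0  # 1 mL = 1000 cubic millimetres

if __name__ == "__main__":
    # Hypothetical measurements taken from an unscaled reconstruction.
    scale = metric_scale(card_width_model_units=2.14)
    print(volume_ml(volume_model_units=4.0, scale_mm_per_unit=scale))
```

Because the scale factor is cubed, a small error in the measured card width has a disproportionate effect on the volume estimate, which is one reason an additional scale cue such as an IMU can help.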
Abstract:
Agricultural intensification has caused a decline in structural elements in European farmland, where natural habitats are increasingly fragmented. The loss of habitat structures has a detrimental effect on biodiversity and affects bat species that depend on vegetation structures for foraging and commuting. We investigated the impact of connectivity and configuration of structural landscape elements on flight activity, species richness and diversity of insectivorous bats and distinguished three bat guilds according to species-specific bioacoustic characteristics. We tested whether bats with shorter-range echolocation were more sensitive to habitat fragmentation than bats with longer-range echolocation. We expected to find different connectivity thresholds for the three guilds and hypothesized that bats prefer linear over patchy landscape elements. Bat activity was quantified using repeated acoustic monitoring in 225 locations at 15 study plots distributed across the Swiss Central Plateau, where connectivity and the shape of landscape elements were determined by spatial analysis (GIS). Spectrograms of bat calls were assigned to species with the software batit by means of image recognition and statistical classification algorithms. Bat activity was significantly higher around landscape elements compared to open control areas. Short- and long-range echolocating bats were more active in well-connected landscapes, but optimal connectivity levels differed between the guilds. Species richness increased significantly with connectivity, while species diversity did not (Shannon's diversity index). Total bat activity was unaffected by the shape of landscape elements. Synthesis and applications. This study highlights the importance of connectivity in farmland landscapes for bats, with shorter-range echolocating bats being particularly sensitive to habitat fragmentation. 
More structurally diverse landscape elements are likely to reduce population declines of bats and could improve conditions for other declining species, including birds. Activity was highest around optimal values of connectivity, which must be evaluated for the different guilds and spatially targeted for a region's habitat configuration. In a multi-species approach, we recommend that the reintroduction of structural elements to increase habitat heterogeneity become part of agri-environment schemes.
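The distinction the abstract draws between species richness and Shannon's diversity index can be made concrete with a short sketch. The index is the standard H = -Σ p_i ln p_i over species proportions; the call counts in the usage example are made up and are not data from the study.

```python
import math

def shannon_diversity(counts):
    """Shannon's diversity index H = -sum(p_i * ln p_i) over non-empty species."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one observation is required")
    h = 0.0
    for n in counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h

def species_richness(counts):
    """Number of species with at least one observation."""
    return sum(1 for n in counts if n > 0)

if __name__ == "__main__":
    # Hypothetical call counts per species at two sites.
    even_site = [20, 20]
    uneven_site = [50, 1, 1, 1]
    print(species_richness(even_site), shannon_diversity(even_site))
    print(species_richness(uneven_site), shannon_diversity(uneven_site))
```

A site dominated by one species can have higher richness but lower diversity than a site with fewer, evenly represented species, which is how richness can respond to connectivity while the index does not.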
Abstract:
This dissertation investigates computer image recognition applied to the analysis of medical images in digital mammography. There is interest in developing learning systems that assist radiologists in recognizing microcalcifications, to support them in breast cancer screening and prevention programs. Analysis of microcalcifications has emerged as a key technique for early diagnosis, but the design of automatic systems to recognize them is complicated by the variability and conditions of mammographic images. This thesis discusses the theoretical approaches to designing image recognition systems, with emphasis on the specific problems of detecting and classifying microcalcifications. The study covers techniques ranging from morphological operators, neural networks and support vector machines to the most recent deep learning with convolutional neural networks, and considers the importance of the concepts of scale and hierarchy at the design stage and their implications for the search for the architecture of connections and layers of the network. With these theoretical foundations and design elements drawn from the author's other work in this area, three mammogram recognition systems that reflect a technological evolution are implemented, culminating in a system based on Convolutional Neural Networks (CNN) whose architecture is designed using the preceding theoretical analysis and the practical results of scale analysis carried out on our image database. All three systems are trained and validated on the DDSM mammography database, with a total of 100 training samples and 100 test samples chosen to avoid bias and to faithfully reflect a real screening program. The validity of CNNs for this problem is demonstrated, and a line of research for designing their architecture is proposed.
Abstract:
This paper proposes a new feature representation method based on the construction of a Confidence Matrix (CM). This representation consists of posterior probability values provided by several weak classifiers, each one trained and used on a different set of features from the original sample. The CM relieves the final classifier of having to discover underlying groups of features. In this work the CM is applied to isolated character image recognition, for which several sets of features can be extracted from each sample. Experimentation has shown that the use of the CM yields a significant improvement in accuracy in most cases, while accuracy remains unchanged in the others. The results were obtained from experiments on four well-known corpora, using evolved meta-classifiers with the k-Nearest Neighbor rule as a weak classifier and applying statistical significance tests.
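The construction of a Confidence Matrix from k-Nearest Neighbor weak classifiers can be sketched roughly as below. The details are assumptions made for illustration: the posterior for a class is estimated as the fraction of the k nearest neighbours carrying that label, squared Euclidean distance is used, and the two toy feature sets and labels are invented. The paper's own estimator and feature sets may differ.

```python
def knn_posteriors(train, query, classes, k):
    """Estimate class posteriors as label frequencies among the k nearest samples.

    train is a list of (feature_vector, label) pairs for one feature set.
    """
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))

    neighbours = sorted(train, key=squared_distance)[:k]
    return [
        sum(1 for _, label in neighbours if label == c) / len(neighbours)
        for c in classes
    ]

def confidence_matrix(train_views, query_views, classes, k=3):
    """One row of posteriors per feature set; rows are ordered like train_views."""
    return [
        knn_posteriors(train, query, classes, k)
        for train, query in zip(train_views, query_views)
    ]

if __name__ == "__main__":
    classes = ["a", "b"]
    # Two hypothetical feature sets extracted from the same training samples.
    view_1 = [([0.0], "a"), ([0.1], "a"), ([1.0], "b"), ([1.1], "b")]
    view_2 = [([0.0, 0.0], "a"), ([5.0, 5.0], "b"), ([5.1, 5.0], "b"), ([4.9, 5.2], "b")]
    cm = confidence_matrix([view_1, view_2], [[0.05], [5.0, 5.0]], classes, k=3)
    print(cm)  # the flattened matrix would be the input to the final classifier
```

Each row comes from a different weak classifier, so the final classifier sees per-feature-set confidences instead of the raw features.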