851 resultados para computer vision face recognition detection voice recognition sistemi biometrici iOS


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Negli ultimi anni si è assistito ad una radicale rivoluzione nell’ambito dei dispositivi di interazione uomo-macchina. Da dispositivi tradizionali come il mouse o la tastiera si è passati allo sviluppo di nuovi sistemi capaci di riconoscere i movimenti compiuti dall’utente (interfacce basate sulla visione o sull’uso di accelerometri) o rilevare il contatto (interfacce di tipo touch). Questi sistemi sono nati con lo scopo di fornire maggiore naturalezza alla comunicazione uomo-macchina. Le nuove interfacce sono molto più espressive di quelle tradizionali poiché sfruttano le capacità di comunicazione naturali degli utenti, su tutte il linguaggio gestuale. Essere in grado di riconoscere gli esseri umani, in termini delle azioni che stanno svolgendo o delle posture che stanno assumendo, apre le porte a una serie vastissima di interessanti applicazioni. Ad oggi sistemi di riconoscimento delle parti del corpo umano e dei gesti sono ampiamente utilizzati in diversi ambiti, come l’interpretazione del linguaggio dei segni, in robotica per l’assistenza sociale, per indica- re direzioni attraverso il puntamento, nel riconoscimento di gesti facciali [1], interfacce naturali per computer (valida alternativa a mouse e tastiera), ampliare e rendere unica l’esperienza dei videogiochi (ad esempio Microsoft 1 Introduzione Kinect© e Nintendo Wii©), nell’affective computing1 . Mostre pubbliche e musei non fanno eccezione, assumendo un ruolo cen- trale nel coadiuvare una tecnologia prettamente volta all’intrattenimento con la cultura (e l’istruzione). In questo scenario, un sistema HCI deve cercare di coinvolgere un pubblico molto eterogeneo, composto, anche, da chi non ha a che fare ogni giorno con interfacce di questo tipo (o semplicemente con un computer), ma curioso e desideroso di beneficiare del sistema. Inoltre, si deve tenere conto che un ambiente museale presenta dei requisiti e alcune caratteristiche distintive che non possono essere ignorati. La tecnologia immersa in un contesto tale deve rispettare determinati vincoli, come: - non può essere invasiva; - deve essere coinvolgente, senza mettere in secondo piano gli artefatti; - deve essere flessibile; - richiedere il minor uso (o meglio, la totale assenza) di dispositivi hardware. In questa tesi, considerando le premesse sopracitate, si presenta una sistema che può essere utilizzato efficacemente in un contesto museale, o in un ambiente che richieda soluzioni non invasive. Il metodo proposto, utilizzando solo una webcam e nessun altro dispositivo personalizzato o specifico, permette di implementare i servizi di: (a) rilevamento e (b) monitoraggio dei visitatori, (c) riconoscimento delle azioni.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Questa tesi si inserisce nel filone di ricerca dell'elaborazione di dati 3D, e in particolare nella 3D Object Recognition, e delinea in primo luogo una panoramica sulle principali rappresentazioni strutturate di dati 3D, le quali rappresentano una prerogativa necessaria per implementare in modo efficiente algoritmi di processing di dati 3D, per poi presentare un nuovo algoritmo di 3D Keypoint Detection che è stato sviluppato e proposto dal Computer Vision Laboratory dell'Università di Bologna presso il quale ho effettuato la mia attività di tesi.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Aviation security strongly depends on screeners' performance in the detection of threat objects in x-ray images of passenger bags. We examined for the first time the effects of stress and stress-induced cortisol increases on detection performance of hidden weapons in an x-ray baggage screening task. We randomly assigned 48 participants either to a stress or a nonstress group. The stress group was exposed to a standardized psychosocial stress test (TSST). Before and after stress/nonstress, participants had to detect threat objects in a computer-based object recognition test (X-ray ORT). We repeatedly measured salivary cortisol and X-ray ORT performance before and after stress/nonstress. Cortisol increases in reaction to psychosocial stress induction but not to nonstress independently impaired x-ray detection performance. Our results suggest that stress-induced cortisol increases at peak reactivity impair x-ray screening performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Despite that Critical Infrastructures (CIs) security and surveillance are a growing concern for many countries and companies, Multi Robot Systems (MRSs) have not been yet broadly used in this type of facilities. This dissertation presents a novel study of the challenges arisen by the implementation of this type of systems and proposes solutions to specific problems. First, a comprehensive analysis of different types of CIs has been carried out, emphasizing the influence of the different characteristics of the facilities in the design of a security and surveillance MRS. One of the most important needs for the surveillance of a CI is the detection of intruders. From a technical point of view this problem can be abstracted as equivalent to the Detection and Tracking of Mobile Objects (DATMO). This dissertation proposes algorithms to solve this specific problem in a CI environment. Using 3D range images of the environment as input data, two detection algorithms for ground robots have been developed. These detection algorithms provide a list of moving objects in the robot detection area. Direct image differentiation and computer vision techniques are used when the robot is static. Alternatively, multi-layer ground reconstructions are compared to detect the dynamic objects when the robot is moving. Since CIs usually spread over large areas, it is very useful to incorporate aerial vehicles in the surveillance MRS. Therefore, a moving object detection algorithm for aerial vehicles has been also developed. This algorithm compares the real optical flow obtained from a down-face oriented camera with an artificial optical flow computed using a RANSAC based homography matrix. Two tracking algorithms have been developed to follow the moving objects trajectories. These algorithms can efficiently handle occlusions and crossings, as well as exchange information among robots. The multirobot tracking can be applied to any type of communication structure: centralized, decentralized or a combination of both. Even more, the developed tracking algorithms are independent of the detection algorithms and could be potentially used with other detection procedures or even with static sensors, such as cameras. In addition, using the 3D point clouds available to the robots, a relative localization algorithm has been developed to improve the position estimation of a given robot with observations from other robots. All the developed algorithms have been extensively tested in different simulated CIs using the Webots robotics simulator. Furthermore, the algorithms have also been validated with real robots operating in real scenarios. In conclusion, this dissertation presents a multirobot approach to Critical Infrastructure Surveillance, mainly focusing on Detecting and Tracking Dynamic Objects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

El presente proyecto trata sobre uno de los campos más problemáticos de la inteligencia artificial, el reconocimiento facial. Algo tan sencillo para las personas como es reconocer una cara conocida se traduce en complejos algoritmos y miles de datos procesados en cuestión de segundos. El proyecto comienza con un estudio del estado del arte de las diversas técnicas de reconocimiento facial, desde las más utilizadas y probadas como el PCA y el LDA, hasta técnicas experimentales que utilizan imágenes térmicas en lugar de las clásicas con luz visible. A continuación, se ha implementado una aplicación en lenguaje C++ que sea capaz de reconocer a personas almacenadas en su base de datos leyendo directamente imágenes desde una webcam. Para realizar la aplicación, se ha utilizado una de las librerías más extendidas en cuanto a procesado de imágenes y visión artificial, OpenCV. Como IDE se ha escogido Visual Studio 2010, que cuenta con una versión gratuita para estudiantes. La técnica escogida para implementar la aplicación es la del PCA ya que es una técnica básica en el reconocimiento facial, y además sirve de base para soluciones mucho más complejas. Se han estudiado los fundamentos matemáticos de la técnica para entender cómo procesa la información y en qué se datos se basa para realizar el reconocimiento. Por último, se ha implementado un algoritmo de testeo para poder conocer la fiabilidad de la aplicación con varias bases de datos de imágenes faciales. De esta forma, se puede comprobar los puntos fuertes y débiles del PCA. ABSTRACT. This project deals with one of the most problematic areas of artificial intelligence, facial recognition. Something so simple for human as to recognize a familiar face becomes into complex algorithms and thousands of data processed in seconds. The project begins with a study of the state of the art of various face recognition techniques, from the most used and tested as PCA and LDA, to experimental techniques that use thermal images instead of the classic visible light images. Next, an application has been implemented in C + + language that is able to recognize people stored in a database reading images directly from a webcam. To make the application, it has used one of the most outstretched libraries in terms of image processing and computer vision, OpenCV. Visual Studio 2010 has been chosen as the IDE, which has a free student version. The technique chosen to implement the software is the PCA because it is a basic technique in face recognition, and also provides a basis for more complex solutions. The mathematical foundations of the technique have been studied to understand how it processes the information and which data are used to do the recognition. Finally, an algorithm for testing has been implemented to know the reliability of the application with multiple databases of facial images. In this way, the strengths and weaknesses of the PCA can be checked.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the recent years, the computer vision community has shown great interest on depth-based applications thanks to the performance and flexibility of the new generation of RGB-D imagery. In this paper, we present an efficient background subtraction algorithm based on the fusion of multiple region-based classifiers that processes depth and color data provided by RGB-D cameras. Foreground objects are detected by combining a region-based foreground prediction (based on depth data) with different background models (based on a Mixture of Gaussian algorithm) providing color and depth descriptions of the scene at pixel and region level. The information given by these modules is fused in a mixture of experts fashion to improve the foreground detection accuracy. The main contributions of the paper are the region-based models of both background and foreground, built from the depth and color data. The obtained results using different database sequences demonstrate that the proposed approach leads to a higher detection accuracy with respect to existing state-of-the-art techniques.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La familia de algoritmos de Boosting son un tipo de técnicas de clasificación y regresión que han demostrado ser muy eficaces en problemas de Visión Computacional. Tal es el caso de los problemas de detección, de seguimiento o bien de reconocimiento de caras, personas, objetos deformables y acciones. El primer y más popular algoritmo de Boosting, AdaBoost, fue concebido para problemas binarios. Desde entonces, muchas han sido las propuestas que han aparecido con objeto de trasladarlo a otros dominios más generales: multiclase, multilabel, con costes, etc. Nuestro interés se centra en extender AdaBoost al terreno de la clasificación multiclase, considerándolo como un primer paso para posteriores ampliaciones. En la presente tesis proponemos dos algoritmos de Boosting para problemas multiclase basados en nuevas derivaciones del concepto margen. El primero de ellos, PIBoost, está concebido para abordar el problema descomponiéndolo en subproblemas binarios. Por un lado, usamos una codificación vectorial para representar etiquetas y, por otro, utilizamos la función de pérdida exponencial multiclase para evaluar las respuestas. Esta codificación produce un conjunto de valores margen que conllevan un rango de penalizaciones en caso de fallo y recompensas en caso de acierto. La optimización iterativa del modelo genera un proceso de Boosting asimétrico cuyos costes dependen del número de etiquetas separadas por cada clasificador débil. De este modo nuestro algoritmo de Boosting tiene en cuenta el desbalanceo debido a las clases a la hora de construir el clasificador. El resultado es un método bien fundamentado que extiende de manera canónica al AdaBoost original. El segundo algoritmo propuesto, BAdaCost, está concebido para problemas multiclase dotados de una matriz de costes. Motivados por los escasos trabajos dedicados a generalizar AdaBoost al terreno multiclase con costes, hemos propuesto un nuevo concepto de margen que, a su vez, permite derivar una función de pérdida adecuada para evaluar costes. Consideramos nuestro algoritmo como la extensión más canónica de AdaBoost para este tipo de problemas, ya que generaliza a los algoritmos SAMME, Cost-Sensitive AdaBoost y PIBoost. Por otro lado, sugerimos un simple procedimiento para calcular matrices de coste adecuadas para mejorar el rendimiento de Boosting a la hora de abordar problemas estándar y problemas con datos desbalanceados. Una serie de experimentos nos sirven para demostrar la efectividad de ambos métodos frente a otros conocidos algoritmos de Boosting multiclase en sus respectivas áreas. En dichos experimentos se usan bases de datos de referencia en el área de Machine Learning, en primer lugar para minimizar errores y en segundo lugar para minimizar costes. Además, hemos podido aplicar BAdaCost con éxito a un proceso de segmentación, un caso particular de problema con datos desbalanceados. Concluimos justificando el horizonte de futuro que encierra el marco de trabajo que presentamos, tanto por su aplicabilidad como por su flexibilidad teórica. Abstract The family of Boosting algorithms represents a type of classification and regression approach that has shown to be very effective in Computer Vision problems. Such is the case of detection, tracking and recognition of faces, people, deformable objects and actions. The first and most popular algorithm, AdaBoost, was introduced in the context of binary classification. Since then, many works have been proposed to extend it to the more general multi-class, multi-label, costsensitive, etc... domains. Our interest is centered in extending AdaBoost to two problems in the multi-class field, considering it a first step for upcoming generalizations. In this dissertation we propose two Boosting algorithms for multi-class classification based on new generalizations of the concept of margin. The first of them, PIBoost, is conceived to tackle the multi-class problem by solving many binary sub-problems. We use a vectorial codification to represent class labels and a multi-class exponential loss function to evaluate classifier responses. This representation produces a set of margin values that provide a range of penalties for failures and rewards for successes. The stagewise optimization of this model introduces an asymmetric Boosting procedure whose costs depend on the number of classes separated by each weak-learner. In this way the Boosting procedure takes into account class imbalances when building the ensemble. The resulting algorithm is a well grounded method that canonically extends the original AdaBoost. The second algorithm proposed, BAdaCost, is conceived for multi-class problems endowed with a cost matrix. Motivated by the few cost-sensitive extensions of AdaBoost to the multi-class field, we propose a new margin that, in turn, yields a new loss function appropriate for evaluating costs. Since BAdaCost generalizes SAMME, Cost-Sensitive AdaBoost and PIBoost algorithms, we consider our algorithm as a canonical extension of AdaBoost to this kind of problems. We additionally suggest a simple procedure to compute cost matrices that improve the performance of Boosting in standard and unbalanced problems. A set of experiments is carried out to demonstrate the effectiveness of both methods against other relevant Boosting algorithms in their respective areas. In the experiments we resort to benchmark data sets used in the Machine Learning community, firstly for minimizing classification errors and secondly for minimizing costs. In addition, we successfully applied BAdaCost to a segmentation task, a particular problem in presence of imbalanced data. We conclude the thesis justifying the horizon of future improvements encompassed in our framework, due to its applicability and theoretical flexibility.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we propose two Bayesian methods for detecting and grouping junctions. Our junction detection method evolves from the Kona approach, and it is based on a competitive greedy procedure inspired in the region competition method. Then, junction grouping is accomplished by finding connecting paths between pairs of junctions. Path searching is performed by applying a Bayesian A* algorithm that has been recently proposed. Both methods are efficient and robust, and they are tested with synthetic and real images.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Because faces and bodies share some abstract perceptual features, we hypothesised that similar recognition processes might be used for both. We investigated whether similar caricature effects to those found in facial identity and expression recognition could be found in the recognition of individual bodies and socially meaningful body positions. Participants were trained to name four body positions (anger, fear, disgust, sadness) and four individuals (in a neutral position). We then tested their recognition of extremely caricatured, moderately caricatured, anticaricatured, and undistorted images of each stimulus. Consistent with caricature effects found in face recognition, moderately caricatured representations of individuals' bodies were recognised more accurately than undistorted and extremely caricatured representations. No significant difference was found between participants' recognition of extremely caricatured, moderately caricatured, or undistorted body position line-drawings. AU anti-caricatured representations were named significandy less accurately than the veridical stimuli. Similar mental representations may be used for both bodies and faces.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Three experiments assessed the development of children's part and configural (part-relational) processing in object recognition during adolescence. In total, 312 school children aged 7-16 years and 80 adults were tested in 3-alternative forced choice (3-AFC) tasks. They judged the correct appearance of upright and inverted presented familiar animals, artifacts, and newly learned multipart objects, which had been manipulated either in terms of individual parts or part relations. Manipulation of part relations was constrained to either metric (animals, artifacts, and multipart objects) or categorical (multipart objects only) changes. For animals and artifacts, even the youngest children were close to adult levels for the correct recognition of an individual part change. By contrast, it was not until 11-12 years of age that they achieved similar levels of performance with regard to altered metric part relations. For the newly learned multipart objects, performance was equivalent throughout the tested age range for upright presented stimuli in the case of categorical part-specific and part-relational changes. In the case of metric manipulations, the results confirmed the data pattern observed for animals and artifacts. Together, the results provide converging evidence, with studies of face recognition, for a surprisingly late consolidation of configural-metric relative to part-based object recognition.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation develops an image processing framework with unique feature extraction and similarity measurements for human face recognition in the thermal mid-wave infrared portion of the electromagnetic spectrum. The goals of this research is to design specialized algorithms that would extract facial vasculature information, create a thermal facial signature and identify the individual. The objective is to use such findings in support of a biometrics system for human identification with a high degree of accuracy and a high degree of reliability. This last assertion is due to the minimal to no risk for potential alteration of the intrinsic physiological characteristics seen through thermal infrared imaging. The proposed thermal facial signature recognition is fully integrated and consolidates the main and critical steps of feature extraction, registration, matching through similarity measures, and validation through testing our algorithm on a database, referred to as C-X1, provided by the Computer Vision Research Laboratory at the University of Notre Dame. Feature extraction was accomplished by first registering the infrared images to a reference image using the functional MRI of the Brain’s (FMRIB’s) Linear Image Registration Tool (FLIRT) modified to suit thermal infrared images. This was followed by segmentation of the facial region using an advanced localized contouring algorithm applied on anisotropically diffused thermal images. Thermal feature extraction from facial images was attained by performing morphological operations such as opening and top-hat segmentation to yield thermal signatures for each subject. Four thermal images taken over a period of six months were used to generate thermal signatures and a thermal template for each subject, the thermal template contains only the most prevalent and consistent features. Finally a similarity measure technique was used to match signatures to templates and the Principal Component Analysis (PCA) was used to validate the results of the matching process. Thirteen subjects were used for testing the developed technique on an in-house thermal imaging system. The matching using an Euclidean-based similarity measure showed 88% accuracy in the case of skeletonized signatures and templates, we obtained 90% accuracy for anisotropically diffused signatures and templates. We also employed the Manhattan-based similarity measure and obtained an accuracy of 90.39% for skeletonized and diffused templates and signatures. It was found that an average 18.9% improvement in the similarity measure was obtained when using diffused templates. The Euclidean- and Manhattan-based similarity measure was also applied to skeletonized signatures and templates of 25 subjects in the C-X1 database. The highly accurate results obtained in the matching process along with the generalized design process clearly demonstrate the ability of the thermal infrared system to be used on other thermal imaging based systems and related databases. A novel user-initialization registration of thermal facial images has been successfully implemented. Furthermore, the novel approach at developing a thermal signature template using four images taken at various times ensured that unforeseen changes in the vasculature did not affect the biometric matching process as it relied on consistent thermal features.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Hardware/software (HW/SW) cosimulation integrates software simulation and hardware simulation simultaneously. Usually, HW/SW co-simulation platform is used to ease debugging and verification for very large-scale integration (VLSI) design. To accelerate the computation of the gesture recognition technique, an HW/SW implementation using field programmable gate array (FPGA) technology is presented in this paper. The major contributions of this work are: (1) a novel design of memory controller in the Verilog Hardware Description Language (Verilog HDL) to reduce memory consumption and load on the processor. (2) The testing part of the neural network algorithm is being hardwired to improve the speed and performance. The American Sign Language gesture recognition is chosen to verify the performance of the approach. Several experiments were carried out on four databases of the gestures (alphabet signs A to Z). (3) The major benefit of this design is that it takes only few milliseconds to recognize the hand gesture which makes it computationally more efficient.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease characterized by progressive muscle weakness that leads the patient to death, usually due to respiratory complications. Thus, as the disease progresses the patient will require noninvasive ventilation (NIV) and constant monitoring. This paper presents a distributed architecture for homecare monitoring of nocturnal NIV in patients with ALS. The implementation of this architecture used single board computers and mobile devices placed in patient’s homes, to display alert messages for caregivers and a web server for remote monitoring by the healthcare staff. The architecture used a software based on fuzzy logic and computer vision to capture data from a mechanical ventilator screen and generate alert messages with instructions for caregivers. The monitoring was performed on 29 patients for 7 con-tinuous hours daily during 5 days generating a total of 126000 samples for each variable monitored at a sampling rate of one sample per second. The system was evaluated regarding the rate of hits for character recognition and its correction through an algorithm for the detection and correction of errors. Furthermore, a healthcare team evaluated regarding the time intervals at which the alert messages were generated and the correctness of such messages. Thus, the system showed an average hit rate of 98.72%, and in the worst case 98.39%. As for the message to be generated, the system also agreed 100% to the overall assessment, and there was disagreement in only 2 cases with one of the physician evaluators.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease characterized by progressive muscle weakness that leads the patient to death, usually due to respiratory complications. Thus, as the disease progresses the patient will require noninvasive ventilation (NIV) and constant monitoring. This paper presents a distributed architecture for homecare monitoring of nocturnal NIV in patients with ALS. The implementation of this architecture used single board computers and mobile devices placed in patient’s homes, to display alert messages for caregivers and a web server for remote monitoring by the healthcare staff. The architecture used a software based on fuzzy logic and computer vision to capture data from a mechanical ventilator screen and generate alert messages with instructions for caregivers. The monitoring was performed on 29 patients for 7 con-tinuous hours daily during 5 days generating a total of 126000 samples for each variable monitored at a sampling rate of one sample per second. The system was evaluated regarding the rate of hits for character recognition and its correction through an algorithm for the detection and correction of errors. Furthermore, a healthcare team evaluated regarding the time intervals at which the alert messages were generated and the correctness of such messages. Thus, the system showed an average hit rate of 98.72%, and in the worst case 98.39%. As for the message to be generated, the system also agreed 100% to the overall assessment, and there was disagreement in only 2 cases with one of the physician evaluators.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nell'elaborato viene introdotto l'ambito della Computer Vision e come l'algoritmo SIFT si inserisce nel suo panorama. Viene inoltre descritto SIFT stesso, le varie fasi di cui si compone e un'applicazione al problema dell'object recognition. Infine viene presentata un'implementazione di SIFT in linguaggio Python creata per ottenere un'applicazione didattica interattiva e vengono mostrati esempi di questa applicazione.