905 resultados para face recognition algorithms
Resumo:
Voluntary control of information processing is crucial to allocate resources and prioritize the processes that are most important under a given situation; the algorithms underlying such control, however, are often not clear. We investigated possible algorithms of control for the performance of the majority function, in which participants searched for and identified one of two alternative categories (left or right pointing arrows) as composing the majority in each stimulus set. We manipulated the amount (set size of 1, 3, and 5) and content (ratio of left and right pointing arrows within a set) of the inputs to test competing hypotheses regarding mental operations for information processing. Using a novel measure based on computational load, we found that reaction time was best predicted by a grouping search algorithm as compared to alternative algorithms (i.e., exhaustive or self-terminating search). The grouping search algorithm involves sampling and resampling of the inputs before a decision is reached. These findings highlight the importance of investigating the implications of voluntary control via algorithms of mental operations.
Resumo:
Dynamic systems, especially in real-life applications, are often determined by inter-/intra-variability, uncertainties and time-varying components. Physiological systems are probably the most representative example in which population variability, vital signal measurement noise and uncertain dynamics render their explicit representation and optimization a rather difficult task. Systems characterized by such challenges often require the use of adaptive algorithmic solutions able to perform an iterative structural and/or parametrical update process towards optimized behavior. Adaptive optimization presents the advantages of (i) individualization through learning of basic system characteristics, (ii) ability to follow time-varying dynamics and (iii) low computational cost. In this chapter, the use of online adaptive algorithms is investigated in two basic research areas related to diabetes management: (i) real-time glucose regulation and (ii) real-time prediction of hypo-/hyperglycemia. The applicability of these methods is illustrated through the design and development of an adaptive glucose control algorithm based on reinforcement learning and optimal control and an adaptive, personalized early-warning system for the recognition and alarm generation against hypo- and hyperglycemic events.
Resumo:
Background: Patients presenting to the emergency department (ED) currently face inacceptable delays in initial treatment, and long, costly hospital stays due to suboptimal initial triage and site-of-care decisions. Accurate ED triage should focus not only on initial treatment priority, but also on prediction of medical risk and nursing needs to improve site-of-care decisions and to simplify early discharge management. Different triage scores have been proposed, such as the Manchester triage system (MTS). Yet, these scores focus only on treatment priority, have suboptimal performance and lack validation in the Swiss health care system. Because the MTS will be introduced into clinical routine at the Kantonsspital Aarau, we propose a large prospective cohort study to optimize initial patient triage. Specifically, the aim of this trial is to derive a three-part triage algorithm to better predict (a) treatment priority; (b) medical risk and thus need for in-hospital treatment; (c) post-acute care needs of patients at the most proximal time point of ED admission. Methods/design: Prospective, observational, multicenter, multi-national cohort study. We will include all consecutive medical patients seeking ED care into this observational registry. There will be no exclusions except for non-adult and non-medical patients. Vital signs will be recorded and left over blood samples will be stored for later batch analysis of blood markers. Upon ED admission, the post-acute care discharge score (PACD) will be recorded. Attending ED physicians will adjudicate triage priority based on all available results at the time of ED discharge to the medical ward. Patients will be reassessed daily during the hospital course for medical stability and readiness for discharge from the nurses and if involved social workers perspective. To assess outcomes, data from electronic medical records will be used and all patients will be contacted 30 days after hospital admission to assess vital and functional status, re-hospitalization, satisfaction with care and quality of life measures. We aim to include between 5000 and 7000 patients over one year of recruitment to derive the three-part triage algorithm. The respective main endpoints were defined as (a) initial triage priority (high vs. low priority) adjudicated by the attending ED physician at ED discharge, (b) adverse 30 day outcome (death or intensive care unit admission) within 30 days following ED admission to assess patients risk and thus need for in-hospital treatment and (c) post acute care needs after hospital discharge, defined as transfer of patients to a post-acute care institution, for early recognition and planning of post-acute care needs. Other outcomes are time to first physician contact, time to initiation of adequate medical therapy, time to social worker involvement, length of hospital stay, reasons fordischarge delays, patient’s satisfaction with care, overall hospital costs and patients care needs after returning home. Discussion: Using a reliable initial triage system for estimating initial treatment priority, need for in-hospital treatment and post-acute care needs is an innovative and persuasive approach for a more targeted and efficient management of medical patients in the ED. The proposed interdisciplinary , multi-national project has unprecedented potential to improve initial triage decisions and optimize resource allocation to the sickest patients from admission to discharge. The algorithms derived in this study will be compared in a later randomized controlled trial against a usual care control group in terms of resource use, length of hospital stay, overall costs and patient’s outcomes in terms of mortality, re-hospitalization, quality of life and satisfaction with care.
Resumo:
Smart homes for the aging population have recently started attracting the attention of the research community. The "health state" of smart homes is comprised of many different levels; starting with the physical health of citizens, it also includes longer-term health norms and outcomes, as well as the arena of positive behavior changes. One of the problems of interest is to monitor the activities of daily living (ADL) of the elderly, aiming at their protection and well-being. For this purpose, we installed passive infrared (PIR) sensors to detect motion in a specific area inside a smart apartment and used them to collect a set of ADL. In a novel approach, we describe a technology that allows the ground truth collected in one smart home to train activity recognition systems for other smart homes. We asked the users to label all instances of all ADL only once and subsequently applied data mining techniques to cluster in-home sensor firings. Each cluster would therefore represent the instances of the same activity. Once the clusters were associated to their corresponding activities, our system was able to recognize future activities. To improve the activity recognition accuracy, our system preprocessed raw sensor data by identifying overlapping activities. To evaluate the recognition performance from a 200-day dataset, we implemented three different active learning classification algorithms and compared their performance: naive Bayesian (NB), support vector machine (SVM) and random forest (RF). Based on our results, the RF classifier recognized activities with an average specificity of 96.53%, a sensitivity of 68.49%, a precision of 74.41% and an F-measure of 71.33%, outperforming both the NB and SVM classifiers. Further clustering markedly improved the results of the RF classifier. An activity recognition system based on PIR sensors in conjunction with a clustering classification approach was able to detect ADL from datasets collected from different homes. Thus, our PIR-based smart home technology could improve care and provide valuable information to better understand the functioning of our societies, as well as to inform both individual and collective action in a smart city scenario.
Resumo:
BACKGROUND Lung clearance index (LCI), a marker of ventilation inhomogeneity, is elevated early in children with cystic fibrosis (CF). However, in infants with CF, LCI values are found to be normal, although structural lung abnormalities are often detectable. We hypothesized that this discrepancy is due to inadequate algorithms of the available software package. AIM Our aim was to challenge the validity of these software algorithms. METHODS We compared multiple breath washout (MBW) results of current software algorithms (automatic modus) to refined algorithms (manual modus) in 17 asymptomatic infants with CF, and 24 matched healthy term-born infants. The main difference between these two analysis methods lies in the calculation of the molar mass differences that the system uses to define the completion of the measurement. RESULTS In infants with CF the refined manual modus revealed clearly elevated LCI above 9 in 8 out of 35 measurements (23%), all showing LCI values below 8.3 using the automatic modus (paired t-test comparing the means, P < 0.001). Healthy infants showed normal LCI values using both analysis methods (n = 47, paired t-test, P = 0.79). The most relevant reason for false normal LCI values in infants with CF using the automatic modus was the incorrect recognition of the end-of-test too early during the washout. CONCLUSION We recommend the use of the manual modus for the analysis of MBW outcomes in infants in order to obtain more accurate results. This will allow appropriate use of infant lung function results for clinical and scientific purposes.
Resumo:
We present tools for rapid and quantitative detection of sediment lamination. The BMPix tool extracts color and gray-scale curves from images at pixel resolution. The PEAK tool uses the gray-scale curve and performs, for the first time, fully automated counting of laminae based on three methods. The maximum count algorithm counts every bright peak of a couplet of two laminae (annual resolution) in a smoothed curve. The zero-crossing algorithm counts every positive and negative halfway-passage of the curve through a wide moving average, separating the record into bright and dark intervals (seasonal resolution). The same is true for the frequency truncation method, which uses Fourier transformation to decompose the curve into its frequency components before counting positive and negative passages. We applied the new methods successfully to tree rings, to well-dated and already manually counted marine varves from Saanich Inlet, and to marine laminae from the Antarctic continental margin. In combination with AMS14C dating, we found convincing evidence that laminations in Weddell Sea sites represent varves, deposited continuously over several millennia during the last glacial maximum. The new tools offer several advantages over previous methods. The counting procedures are based on a moving average generated from gray-scale curves instead of manual counting. Hence, results are highly objective and rely on reproducible mathematical criteria. Also, the PEAK tool measures the thickness of each year or season. Since all information required is displayed graphically, interactive optimization of the counting algorithms can be achieved quickly and conveniently.
Resumo:
Actualmente la detección del rostro humano es un tema difícil debido a varios parámetros implicados. Llega a ser de interés cada vez mayor en diversos campos de aplicaciones como en la identificación personal, la interface hombre-máquina, etc. La mayoría de las imágenes del rostro contienen un fondo que se debe eliminar/discriminar para poder así detectar el rostro humano. Así, este proyecto trata el diseño y la implementación de un sistema de detección facial humana, como el primer paso en el proceso, dejando abierto el camino, para en un posible futuro, ampliar este proyecto al siguiente paso, que sería, el Reconocimiento Facial, tema que no trataremos aquí. En la literatura científica, uno de los trabajos más importantes de detección de rostros en tiempo real es el algoritmo de Viola and Jones, que ha sido tras su uso y con las librerías de Open CV, el algoritmo elegido para el desarrollo de este proyecto. A continuación explicaré un breve resumen sobre el funcionamiento de mi aplicación. Mi aplicación puede capturar video en tiempo real y reconocer el rostro que la Webcam captura frente al resto de objetos que se pueden visualizar a través de ella. Para saber que el rostro es detectado, éste es recuadrado en su totalidad y seguido si este mueve. A su vez, si el usuario lo desea, puede guardar la imagen que la cámara esté mostrando, pudiéndola almacenar en cualquier directorio del PC. Además, incluí la opción de poder detectar el rostro humano sobre una imagen fija, cualquiera que tengamos guardada en nuestro PC, siendo mostradas el número de caras detectadas y pudiendo visualizarlas sucesivamente cuantas veces queramos. Para todo ello como bien he mencionado antes, el algoritmo usado para la detección facial es el de Viola and Jones. Este algoritmo se basa en el escaneo de toda la superficie de la imagen en busca del rostro humano, para ello, primero la imagen se transforma a escala de grises y luego se analiza dicha imagen, mostrando como resultado el rostro encuadrado. ABSTRACT Currently the detection of human face is a difficult issue due to various parameters involved. Becomes of increasing interest in various fields of applications such as personal identification, the man-machine interface, etc. Most of the face images contain a fund to be removed / discriminate in order to detect the human face. Thus, this project is the design and implementation of a human face detection system, as the first step in the process, leaving the way open for a possible future, extend this project to the next step would be, Facial Recognition , a topic not covered here. In the literature, one of the most important face detection in real time is the algorithm of Viola and Jones, who has been after use with Open CV libraries, the algorithm chosen for the development of this project. I will explain a brief summary of the performance of my application. My application can capture video in real time and recognize the face that the Webcam Capture compared to other objects that can be viewed through it. To know that the face is detected, it is fully boxed and followed if this move. In turn, if the user may want to save the image that the camera is showing, could store in any directory on your PC. I also included the option to detect the human face on a still image, whatever we have stored in your PC, being shown the number of faces detected and can view them on more times. For all as well I mentioned before, the algorithm used for face detection is that of Viola and Jones. This algorithm is based on scanning the entire surface of the image for the human face, for this, first the image is converted to gray-scale and then analyzed the image, showing results in the face framed.
Resumo:
Performing activity recognition using the information provided by the different sensors embedded in a smartphone face limitations due to the capabilities of those devices when the computations are carried out in the terminal. In this work a fuzzy inference module is implemented in order to decide which classifier is the most appropriate to be used at a specific moment regarding the application requirements and the device context characterized by its battery level, available memory and CPU load. The set of classifiers that is considered is composed of Decision Tables and Trees that have been trained using different number of sensors and features. In addition, some classifiers perform activity recognition regardless of the on-body device position and others rely on the previous recognition of that position to use a classifier that is trained with measurements gathered with the mobile placed on that specific position. The modules implemented show that an evaluation of the classifiers allows sorting them so the fuzzy inference module can choose periodically the one that best suits the device context and application requirements.
Resumo:
This paper describes a low complexity strategy for detecting and recognizing text signs automatically. Traditional approaches use large image algorithms for detecting the text sign, followed by the application of an Optical Character Recognition (OCR) algorithm in the previously identified areas. This paper proposes a new architecture that applies the OCR to a whole lightly treated image and then carries out the text detection process of the OCR output. The strategy presented in this paper significantly reduces the processing time required for text localization in an image, while guaranteeing a high recognition rate. This strategy will facilitate the incorporation of video processing-based applications into the automatic detection of text sign similar to that of a smartphone. These applications will increase the autonomy of visually impaired people in their daily life.
Resumo:
PAMELA (Phased Array Monitoring for Enhanced Life Assessment) SHMTM System is an integrated embedded ultrasonic guided waves based system consisting of several electronic devices and one system manager controller. The data collected by all PAMELA devices in the system must be transmitted to the controller, who will be responsible for carrying out the advanced signal processing to obtain SHM maps. PAMELA devices consist of hardware based on a Virtex 5 FPGA with a PowerPC 440 running an embedded Linux distribution. Therefore, PAMELA devices, in addition to the capability of performing tests and transmitting the collected data to the controller, have the capability of perform local data processing or pre-processing (reduction, normalization, pattern recognition, feature extraction, etc.). Local data processing decreases the data traffic over the network and allows CPU load of the external computer to be reduced. Even it is possible that PAMELA devices are running autonomously performing scheduled tests, and only communicates with the controller in case of detection of structural damages or when programmed. Each PAMELA device integrates a software management application (SMA) that allows to the developer downloading his own algorithm code and adding the new data processing algorithm to the device. The development of the SMA is done in a virtual machine with an Ubuntu Linux distribution including all necessary software tools to perform the entire cycle of development. Eclipse IDE (Integrated Development Environment) is used to develop the SMA project and to write the code of each data processing algorithm. This paper presents the developed software architecture and describes the necessary steps to add new data processing algorithms to SMA in order to increase the processing capabilities of PAMELA devices.An example of basic damage index estimation using delay and sum algorithm is provided.
Resumo:
A new technology is being proposed as a solution to the problem of unintentional facial detection and recognition in pictures in which the individuals appearing want to express their privacy preferences, through the use of different tags. The existing methods for face de-identification were mostly ad hoc solutions that only provided an absolute binary solution in a privacy context such as pixelation, or a bar mask. As the number and users of social networks are increasing, our preferences regarding our privacy may become more complex, leaving these absolute binary solutions as something obsolete. The proposed technology overcomes this problem by embedding information in a tag which will be placed close to the face without being disruptive. Through a decoding method the tag will provide the preferences that will be applied to the images in further stages.
Resumo:
La familia de algoritmos de Boosting son un tipo de técnicas de clasificación y regresión que han demostrado ser muy eficaces en problemas de Visión Computacional. Tal es el caso de los problemas de detección, de seguimiento o bien de reconocimiento de caras, personas, objetos deformables y acciones. El primer y más popular algoritmo de Boosting, AdaBoost, fue concebido para problemas binarios. Desde entonces, muchas han sido las propuestas que han aparecido con objeto de trasladarlo a otros dominios más generales: multiclase, multilabel, con costes, etc. Nuestro interés se centra en extender AdaBoost al terreno de la clasificación multiclase, considerándolo como un primer paso para posteriores ampliaciones. En la presente tesis proponemos dos algoritmos de Boosting para problemas multiclase basados en nuevas derivaciones del concepto margen. El primero de ellos, PIBoost, está concebido para abordar el problema descomponiéndolo en subproblemas binarios. Por un lado, usamos una codificación vectorial para representar etiquetas y, por otro, utilizamos la función de pérdida exponencial multiclase para evaluar las respuestas. Esta codificación produce un conjunto de valores margen que conllevan un rango de penalizaciones en caso de fallo y recompensas en caso de acierto. La optimización iterativa del modelo genera un proceso de Boosting asimétrico cuyos costes dependen del número de etiquetas separadas por cada clasificador débil. De este modo nuestro algoritmo de Boosting tiene en cuenta el desbalanceo debido a las clases a la hora de construir el clasificador. El resultado es un método bien fundamentado que extiende de manera canónica al AdaBoost original. El segundo algoritmo propuesto, BAdaCost, está concebido para problemas multiclase dotados de una matriz de costes. Motivados por los escasos trabajos dedicados a generalizar AdaBoost al terreno multiclase con costes, hemos propuesto un nuevo concepto de margen que, a su vez, permite derivar una función de pérdida adecuada para evaluar costes. Consideramos nuestro algoritmo como la extensión más canónica de AdaBoost para este tipo de problemas, ya que generaliza a los algoritmos SAMME, Cost-Sensitive AdaBoost y PIBoost. Por otro lado, sugerimos un simple procedimiento para calcular matrices de coste adecuadas para mejorar el rendimiento de Boosting a la hora de abordar problemas estándar y problemas con datos desbalanceados. Una serie de experimentos nos sirven para demostrar la efectividad de ambos métodos frente a otros conocidos algoritmos de Boosting multiclase en sus respectivas áreas. En dichos experimentos se usan bases de datos de referencia en el área de Machine Learning, en primer lugar para minimizar errores y en segundo lugar para minimizar costes. Además, hemos podido aplicar BAdaCost con éxito a un proceso de segmentación, un caso particular de problema con datos desbalanceados. Concluimos justificando el horizonte de futuro que encierra el marco de trabajo que presentamos, tanto por su aplicabilidad como por su flexibilidad teórica. Abstract The family of Boosting algorithms represents a type of classification and regression approach that has shown to be very effective in Computer Vision problems. Such is the case of detection, tracking and recognition of faces, people, deformable objects and actions. The first and most popular algorithm, AdaBoost, was introduced in the context of binary classification. Since then, many works have been proposed to extend it to the more general multi-class, multi-label, costsensitive, etc... domains. Our interest is centered in extending AdaBoost to two problems in the multi-class field, considering it a first step for upcoming generalizations. In this dissertation we propose two Boosting algorithms for multi-class classification based on new generalizations of the concept of margin. The first of them, PIBoost, is conceived to tackle the multi-class problem by solving many binary sub-problems. We use a vectorial codification to represent class labels and a multi-class exponential loss function to evaluate classifier responses. This representation produces a set of margin values that provide a range of penalties for failures and rewards for successes. The stagewise optimization of this model introduces an asymmetric Boosting procedure whose costs depend on the number of classes separated by each weak-learner. In this way the Boosting procedure takes into account class imbalances when building the ensemble. The resulting algorithm is a well grounded method that canonically extends the original AdaBoost. The second algorithm proposed, BAdaCost, is conceived for multi-class problems endowed with a cost matrix. Motivated by the few cost-sensitive extensions of AdaBoost to the multi-class field, we propose a new margin that, in turn, yields a new loss function appropriate for evaluating costs. Since BAdaCost generalizes SAMME, Cost-Sensitive AdaBoost and PIBoost algorithms, we consider our algorithm as a canonical extension of AdaBoost to this kind of problems. We additionally suggest a simple procedure to compute cost matrices that improve the performance of Boosting in standard and unbalanced problems. A set of experiments is carried out to demonstrate the effectiveness of both methods against other relevant Boosting algorithms in their respective areas. In the experiments we resort to benchmark data sets used in the Machine Learning community, firstly for minimizing classification errors and secondly for minimizing costs. In addition, we successfully applied BAdaCost to a segmentation task, a particular problem in presence of imbalanced data. We conclude the thesis justifying the horizon of future improvements encompassed in our framework, due to its applicability and theoretical flexibility.
Resumo:
Este trabajo presenta una solución al problema del reconocimiento del género de un rostro humano a partir de una imagen. Adoptamos una aproximación que utiliza la cara completa a través de la textura de la cara normalizada y redimensionada como entrada a un clasificador Näive Bayes. Presentamos la técnica de Análisis de Componentes Principales Probabilístico Condicionado-a-la-Clase (CC-PPCA) para reducir la dimensionalidad de los vectores de características para la clasificación y asegurar la asunción de independencia para el clasificador. Esta nueva aproximación tiene la deseable propiedad de presentar un modelo paramétrico sencillo para las marginales. Además, este modelo puede estimarse con muy pocos datos. En los experimentos que hemos desarrollados mostramos que CC-PPCA obtiene un 90% de acierto en la clasificación, resultado muy similar al mejor presentado en la literatura---ABSTRACT---This paper presents a solution to the problem of recognizing the gender of a human face from an image. We adopt a holistic approach by using the cropped and normalized texture of the face as input to a Naïve Bayes classifier. First it is introduced the Class-Conditional Probabilistic Principal Component Analysis (CC-PPCA) technique to reduce the dimensionality of the classification attribute vector and enforce the independence assumption of the classifier. This new approach has the desirable property of a simple parametric model for the marginals. Moreover this model can be estimated with very few data. In the experiments conducted we show that using CCPPCA we get 90% classification accuracy, which is similar result to the best in the literature. The proposed method is very simple to train and implement.
Resumo:
El uso de técnicas para la monitorización del movimiento humano generalmente permite a los investigadores analizar la cinemática y especialmente las capacidades motoras en aquellas actividades de la vida cotidiana que persiguen un objetivo concreto como pueden ser la preparación de bebidas y comida, e incluso en tareas de aseo. Adicionalmente, la evaluación del movimiento y el comportamiento humanos en el campo de la rehabilitación cognitiva es esencial para profundizar en las dificultades que algunas personas encuentran en la ejecución de actividades diarias después de accidentes cerebro-vasculares. Estas dificultades están principalmente asociadas a la realización de pasos secuenciales y al reconocimiento del uso de herramientas y objetos. La interpretación de los datos sobre la actitud de este tipo de pacientes para reconocer y determinar el nivel de éxito en la ejecución de las acciones, y para ampliar el conocimiento en las enfermedades cerebrales, sus consecuencias y severidad, depende totalmente de los dispositivos usados para la captura de esos datos y de la calidad de los mismos. Más aún, existe una necesidad real de mejorar las técnicas actuales de rehabilitación cognitiva contribuyendo al diseño de sistemas automáticos para crear una especie de terapeuta virtual que asegure una vida más independiente de estos pacientes y reduzca la carga de trabajo de los terapeutas. Con este objetivo, el uso de sensores y dispositivos para obtener datos en tiempo real de la ejecución y estado de la tarea de rehabilitación es esencial para también contribuir al diseño y entrenamiento de futuros algoritmos que pudieran reconocer errores automáticamente para informar al paciente acerca de ellos mediante distintos tipos de pistas como pueden ser imágenes, mensajes auditivos o incluso videos. La tecnología y soluciones existentes en este campo no ofrecen una manera totalmente robusta y efectiva para obtener datos en tiempo real, por un lado, porque pueden influir en el movimiento del propio paciente en caso de las plataformas basadas en el uso de marcadores que necesitan sensores pegados en la piel; y por otro lado, debido a la complejidad o alto coste de implantación lo que hace difícil pensar en la idea de instalar un sistema en el hospital o incluso en la casa del paciente. Esta tesis presenta la investigación realizada en el campo de la monitorización del movimiento de pacientes para proporcionar un paso adelante en términos de detección, seguimiento y reconocimiento del comportamiento de manos, gestos y cara mediante una manera no invasiva la cual puede mejorar la técnicas actuales de rehabilitación cognitiva para la adquisición en tiempo real de datos sobre el comportamiento del paciente y la ejecución de la tarea. Para entender la importancia del marco de esta tesis, inicialmente se presenta un resumen de las principales enfermedades cognitivas y se introducen las consecuencias que tienen en la ejecución de tareas de la vida diaria. Más aún, se investiga sobre las metodologías actuales de rehabilitación cognitiva. Teniendo en cuenta que las manos son la principal parte del cuerpo para la ejecución de tareas manuales de la vida cotidiana, también se resumen las tecnologías existentes para la captura de movimiento de manos. Una de las principales contribuciones de esta tesis está relacionada con el diseño y evaluación de una solución no invasiva para detectar y seguir las manos durante la ejecución de tareas manuales de la vida cotidiana que a su vez involucran la manipulación de objetos. Esta solución la cual no necesita marcadores adicionales y está basada en una cámara de profundidad de bajo coste, es robusta, precisa y fácil de instalar. Otra contribución presentada se centra en el reconocimiento de gestos para detectar el agarre de objetos basado en un sensor infrarrojo de última generación, y también complementado con una cámara de profundidad. Esta nueva técnica, y también no invasiva, sincroniza ambos sensores para seguir objetos específicos además de reconocer eventos concretos relacionados con tareas de aseo. Más aún, se realiza una evaluación preliminar del reconocimiento de expresiones faciales para analizar si es adecuado para el reconocimiento del estado de ánimo durante la tarea. Por su parte, todos los componentes y algoritmos desarrollados son integrados en un prototipo simple para ser usado como plataforma de monitorización. Se realiza una evaluación técnica del funcionamiento de cada dispositivo para analizar si es adecuada para adquirir datos en tiempo real durante la ejecución de tareas cotidianas reales. Finalmente, se estudia la interacción con pacientes reales para obtener información del nivel de usabilidad del prototipo. Dicha información es esencial y útil para considerar una rehabilitación cognitiva basada en la idea de instalación del sistema en la propia casa del paciente al igual que en el hospital correspondiente. ABSTRACT The use of human motion monitoring techniques usually let researchers to analyse kinematics, especially in motor strategies for goal-oriented activities of daily living, such as the preparation of drinks and food, and even grooming tasks. Additionally, the evaluation of human movements and behaviour in the field of cognitive rehabilitation is essential to deep into the difficulties some people find in common activities after stroke. This difficulties are mainly associated with sequence actions and the recognition of tools usage. The interpretation of attitude data of this kind of patients in order to recognize and determine the level of success of the execution of actions, and to broaden the knowledge in brain diseases, consequences and severity, depends totally on the devices used for the capture of that data and the quality of it. Moreover, there is a real need of improving the current cognitive rehabilitation techniques by contributing to the design of automatic systems to create a kind of virtual therapist for the improvement of the independent life of these stroke patients and to reduce the workload of the occupational therapists currently in charge of them. For this purpose, the use of sensors and devices to obtain real time data of the execution and state of the rehabilitation task is essential to also contribute to the design and training of future smart algorithms which may recognise errors to automatically provide multimodal feedback through different types of cues such as still images, auditory messages or even videos. The technology and solutions currently adopted in the field don't offer a totally robust and effective way for obtaining real time data, on the one hand, because they may influence the patient's movement in case of marker-based platforms which need sensors attached to the skin; and on the other hand, because of the complexity or high cost of implementation, which make difficult the idea of installing a system at the hospital or even patient's home. This thesis presents the research done in the field of user monitoring to provide a step forward in terms of detection, tracking and recognition of hand movements, gestures and face via a non-invasive way which could improve current techniques for cognitive rehabilitation for real time data acquisition of patient's behaviour and execution of the task. In order to understand the importance of the scope of the thesis, initially, a summary of the main cognitive diseases that require for rehabilitation and an introduction of the consequences on the execution of daily tasks are presented. Moreover, research is done about the actual methodology to provide cognitive rehabilitation. Considering that the main body members involved in the completion of a handmade daily task are the hands, the current technologies for human hands movements capture are also highlighted. One of the main contributions of this thesis is related to the design and evaluation of a non-invasive approach to detect and track user's hands during the execution of handmade activities of daily living which involve the manipulation of objects. This approach does not need the inclusion of any additional markers. In addition, it is only based on a low-cost depth camera, it is robust, accurate and easy to install. Another contribution presented is focused on the hand gesture recognition for detecting object grasping based on a brand new infrared sensor, and also complemented with a depth camera. This new, and also non-invasive, solution which synchronizes both sensors to track specific tools as well as recognize specific events related to grooming is evaluated. Moreover, a preliminary assessment of the recognition of facial expressions is carried out to analyse if it is adequate for recognizing mood during the execution of task. Meanwhile, all the corresponding hardware and software developed are integrated in a simple prototype with the purpose of being used as a platform for monitoring the execution of the rehabilitation task. Technical evaluation of the performance of each device is carried out in order to analyze its suitability to acquire real time data during the execution of real daily tasks. Finally, a kind of healthcare evaluation is also presented to obtain feedback about the usability of the system proposed paying special attention to the interaction with real users and stroke patients. This feedback is quite useful to consider the idea of a home-based cognitive rehabilitation as well as a possible hospital installation of the prototype.
Resumo:
The aim of this Master Thesis is the analysis, design and development of a robust and reliable Human-Computer Interaction interface, based on visual hand-gesture recognition. The implementation of the required functions is oriented to the simulation of a classical hardware interaction device: the mouse, by recognizing a specific hand-gesture vocabulary in color video sequences. For this purpose, a prototype of a hand-gesture recognition system has been designed and implemented, which is composed of three stages: detection, tracking and recognition. This system is based on machine learning methods and pattern recognition techniques, which have been integrated together with other image processing approaches to get a high recognition accuracy and a low computational cost. Regarding pattern recongition techniques, several algorithms and strategies have been designed and implemented, which are applicable to color images and video sequences. The design of these algorithms has the purpose of extracting spatial and spatio-temporal features from static and dynamic hand gestures, in order to identify them in a robust and reliable way. Finally, a visual database containing the necessary vocabulary of gestures for interacting with the computer has been created.