868 results for Computer vision methods for sign language recognition
Abstract:
A new language recognition technique is described, based on applying the Shifted Delta Coefficients (SDC) approach to phone log-likelihood ratio (PLLR) features. The new methodology incorporates long-span phonetic information at a frame-by-frame level while accounting for the temporal length of each phone unit. The proposed features are used to train an i-vector based system, which is tested on the Albayzin LRE 2012 dataset. The results show a relative improvement of 33.3% in Cavg over different state-of-the-art acoustic i-vector based systems. The paper also presents the integration of parallel phone ASR systems, each of which generates multiple PLLR coefficients that are stacked together and then projected into a reduced dimension. Finally, the paper shows how incorporating state information from the phone ASR provides additional improvements, and how fusion with the other acoustic and phonotactic systems yields an improvement of 25.8% over the system presented during the competition.
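As a rough illustration of the shifted-delta idea mentioned in this abstract, the sketch below stacks k delta blocks, each spaced P frames apart and computed over a span of ±d frames, for every frame of a generic feature sequence. The function name `shifted_delta`, the default parameter values, and the choice to clamp indices at the sequence edges are assumptions made for this example; the abstract does not state the configuration used by the authors, and the input here is any per-frame feature vector rather than real PLLR features.

```python
def shifted_delta(frames, d=1, p=3, k=7):
    """Stack k shifted delta blocks per frame (hypothetical helper).

    frames: list of equal-length feature vectors, one per frame.
    d: half-span of each delta; p: shift between blocks; k: number of blocks.
    """
    n = len(frames)

    def at(index):
        # Assumption: clamp out-of-range indices to the first/last frame.
        return frames[min(max(index, 0), n - 1)]

    stacked_frames = []
    for t in range(n):
        stacked = []
        for i in range(k):
            ahead = at(t + i * p + d)
            behind = at(t + i * p - d)
            # Delta block i: c(t + i*p + d) - c(t + i*p - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        stacked_frames.append(stacked)
    return stacked_frames


if __name__ == "__main__":
    # Toy one-dimensional "features": a ramp 0, 1, 2, ...
    toy = [[float(t)] for t in range(10)]
    print(shifted_delta(toy, d=1, p=3, k=2)[2])
```

Each output vector has k times the input dimension, which is how the representation captures context well beyond a single frame.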
Abstract:
The evolution of smartphones, all equipped with digital cameras, is driving a growing demand for ever more complex applications that rely on real-time computer vision algorithms. However, video signals keep growing in size, whereas the performance of single-core processors has stagnated in recent years. Consequently, new computer vision algorithms need to be parallel so that they can run on multiple processors and scale computationally. One of the most promising classes of processors today is the graphics processing unit (GPU), a device offering a high degree of parallelism, excellent numerical performance and increasing versatility, which makes it attractive for scientific computing. In this thesis, we explore two computer vision applications whose high computational complexity precludes them from running in real time on traditional uniprocessors. We show that, by parallelizing subtasks and implementing them on a GPU, both applications attain their goal of running at interactive frame rates. In addition, we propose a technique for fast evaluation of arbitrarily complex functions, specially designed for GPU implementation.
First, we explore the application of depth-image-based rendering techniques to the unusual configuration of two convergent, wide-baseline cameras with color and depth information, in contrast to the narrow-baseline, parallel cameras typically used in 3D TV. Using a backward mapping approach with a depth inpainting scheme based on median filters, we show that these techniques are adequate for free-viewpoint video applications. In addition, we show that referring depth information to a global reference system is ill-advised and should be avoided. Then, we propose a background subtraction system based on kernel density estimation techniques. These techniques are well suited to modelling complex scenes featuring multimodal backgrounds, but have seen little use because of their high computational and memory complexity. The proposed system, implemented in real time on a GPU, features novel proposals for dynamic kernel bandwidth estimation for the background model, selective update of the background model, update of the position of reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared to other state-of-the-art algorithms, demonstrate the high quality and versatility of our proposal.
Finally, we propose a general method for the approximation of arbitrarily complex functions using continuous piecewise linear functions, specially formulated for GPU implementation by leveraging the texture filtering units, which are normally unused for numerical computation. Our proposal features a rigorous mathematical analysis of the approximation error as a function of the number of samples, as well as a method to obtain a near-optimal partition of the domain of the function to minimize the approximation error.
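The piecewise linear approximation described at the end of this abstract can be illustrated on the CPU with a uniformly sampled lookup table and linear interpolation between neighbouring samples, which is the operation a texture filtering unit performs in hardware. This is only a sketch under stated assumptions: the helper names `build_table` and `lerp_eval` are hypothetical, the partition is uniform (the thesis proposes a near-optimal, non-uniform one), and nothing here runs on a GPU.

```python
import math


def build_table(f, a, b, n):
    """Sample f at n + 1 equally spaced points on [a, b]."""
    step = (b - a) / n
    return [f(a + i * step) for i in range(n + 1)]


def lerp_eval(table, a, b, x):
    """Evaluate the piecewise linear interpolant of the table at x."""
    n = len(table) - 1
    u = (x - a) / (b - a) * n          # position in "texel" units
    u = min(max(u, 0.0), float(n))     # clamp to the sampled domain
    i = min(int(u), n - 1)             # index of the left sample
    frac = u - i                       # weight of the right sample
    return table[i] * (1.0 - frac) + table[i + 1] * frac


if __name__ == "__main__":
    table = build_table(math.sin, 0.0, math.pi, 64)
    x = 1.0
    print(lerp_eval(table, 0.0, math.pi, x), math.sin(x))
```

For a twice-differentiable function sampled with spacing h, the standard bound on the linear interpolation error is h²/8 times the maximum of |f''| over the interval, so the error of a uniform table shrinks quadratically as the number of samples grows.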
Abstract:
The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution while decreasing size and power consumption. Field Programmable Gate Arrays (FPGAs) provide reprogrammable hardware technology that can be exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using partial reconfigurability at very low power consumption. For highly demanding tasks, FPGAs have been favored because of the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), their reconfigurability and their strong performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable sensors with lower power consumption is being developed in Spain based on FPGAs. This paper presents a review of these developments, describes the FPGA technologies employed by the different research groups, and provides an overview of future research in this field.
Abstract:
In this article, we present JavaVis, a new framework for teaching Computer Vision related subjects. It is a computer vision library divided into three main areas: the 2D package covers classical computer vision processing; the 3D package, which includes a complete 3D geometric toolset, is used for 3D vision computing; and the Desktop package comprises a tool for graphically designing and testing new algorithms. JavaVis is designed to be easy to use, both for launching and testing existing algorithms and for developing new ones.
Abstract:
Objectives: To design and validate a questionnaire to measure visual symptoms related to exposure to computers in the workplace. Study Design and Setting: Our computer vision syndrome questionnaire (CVS-Q) was based on a literature review and validated through discussion with experts and performance of a pretest, pilot test, and retest. Content validity was evaluated by occupational health, optometry, and ophthalmology experts. Rasch analysis was used in the psychometric evaluation of the questionnaire. Criterion validity was determined by calculating the sensitivity and specificity, receiver operating characteristic curve, and cutoff point. Test-retest repeatability was tested using the intraclass correlation coefficient (ICC) and concordance by Cohen’s kappa (κ). Results: The CVS-Q was developed with wide consensus among experts and was well accepted by the target group. It assesses the frequency and intensity of 16 symptoms using a single rating scale (symptom severity) that fits the Rasch rating scale model well. The questionnaire has sensitivity and specificity over 70% and achieved good test-retest repeatability both for the scores obtained [ICC = 0.802; 95% confidence interval (CI): 0.673, 0.884] and CVS classification (κ = 0.612; 95% CI: 0.384, 0.839). Conclusion: The CVS-Q has acceptable psychometric properties, making it a valid and reliable tool to control the visual health of computer workers, and can potentially be used in clinical trials and outcome research.
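For readers unfamiliar with the agreement and accuracy statistics named in this abstract, the following sketch shows how Cohen's kappa, sensitivity and specificity are computed from paired binary classifications. The data are invented toy values, not the study's data, and the function names `cohen_kappa` and `sensitivity_specificity` are hypothetical helpers written for this example.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of categorical labels."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    labels = set(first) | set(second)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum((first.count(label) / n) * (second.count(label) / n)
                   for label in labels)
    # Undefined (division by zero) when chance agreement is exactly 1.
    return (observed - expected) / (1.0 - expected)


def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for 0/1 labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy test/retest classifications (1 = symptomatic, 0 = not).
    test = [1, 1, 0, 0, 1, 0, 1, 0]
    retest = [1, 1, 0, 0, 1, 0, 0, 1]
    print(cohen_kappa(test, retest))
    print(sensitivity_specificity(test, retest))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple proportion of matching classifications.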
Abstract:
Machine vision is an important subject in computer science and engineering degrees. For laboratory experimentation, it is desirable to have a complete and easy-to-use tool. In this work we present JavaVis, an open source Java library oriented to the teaching of computer vision. We have designed and built the library from scratch, with emphasis on readability and understanding rather than on efficiency. However, the library can also be used for research purposes. It consists of a framework with several features that meet the demands of teaching. It has been designed to be easy to use: the user does not have to deal with internal structures or the graphical interface, and a student who needs to add a new algorithm can do so simply. After sketching the library, we focus on the experience students gain from using it in several computer vision courses. Our main goal is to find out whether the students understand what they are doing, that is, how much the library helps them grasp the basic concepts of computer vision. Over the last four years we have conducted surveys to assess how much the students have improved their skills by using this library.
Abstract:
"References": p. 107-108.
Abstract:
Probabilistic robotics, most often applied to the problem of simultaneous localisation and mapping (SLAM), requires measures of uncertainty to accompany observations of the environment. This paper describes how uncertainty can be characterised for a vision system that locates coloured landmarks in a typical laboratory environment. The paper describes a model of the uncertainty in segmentation, in the internal camera model and in the mounting of the camera on the robot. It explains the implementation of the system on a laboratory robot, and provides experimental results that show the coherence of the uncertainty model.
Abstract:
This thesis deals with the challenging problem of designing systems able to perceive objects in underwater environments. In the last few decades, research in robotics has advanced the state of the art in the intervention capabilities of autonomous systems. The state of the art in fields such as localization and navigation, real-time perception and cognition, and safe action and manipulation, applied to ground environments (both indoor and outdoor), has now reached a readiness level that allows high-level autonomous operations. By contrast, the underwater environment remains very difficult for autonomous robots. Water influences the mechanical and electrical design of systems, interferes with sensors by limiting their capabilities, heavily impacts data transmission, and generally requires systems with low power consumption to enable a reasonable mission duration. Interest in underwater applications is driven by the need to explore and intervene in environments where human capabilities are very limited. Nowadays, most underwater field operations are carried out by manned or remotely operated vehicles, deployed for exploration and limited intervention missions. Manned vehicles, controlled directly on board, expose human operators to the risks of remaining in the field for the duration of the mission, within a hostile environment. Remotely Operated Vehicles (ROV) currently represent the most advanced technology for underwater intervention services available on the market. These vehicles can be remotely operated for a long time, but they need support from an oceanographic vessel with multiple teams of highly specialized pilots. Vehicles equipped with multiple state-of-the-art sensors and capable of autonomously planning missions have been deployed in the last ten years and used as observers of underwater fauna, the seabed, shipwrecks, and so on.
On the other hand, underwater operations like object recovery and equipment maintenance remain challenging to conduct without human supervision, since they require object perception and localization with much higher accuracy and robustness, to a degree seldom available in Autonomous Underwater Vehicles (AUV). This thesis reports the study, from design to deployment and evaluation, of a general-purpose, configurable platform dedicated to stereo-vision perception in underwater environments. Several aspects related to the peculiar characteristics of the environment have been taken into account during all stages of system design and evaluation: depth of operation and light conditions, together with water turbidity and external weather, heavily affect perception capabilities. The vision platform proposed in this work is a modular system comprising off-the-shelf components for both the imaging sensors and the computational unit, linked by a high-performance Ethernet network bus. The adopted design philosophy aims at high flexibility in terms of feasible perception applications, which should not be as limited as in the case of special-purpose, dedicated hardware. Flexibility is required by the variability of underwater environments, with water conditions ranging from clear to turbid, light backscattering varying with daylight and depth, strong color distortion, and other environmental factors. Furthermore, the proposed modular design makes the system easier to maintain and update over time. The performance of the proposed system, in terms of perception capabilities, has been evaluated in several underwater contexts, taking advantage of the opportunity offered by the MARIS national project. Design issues such as power consumption, heat dissipation and network capabilities have been evaluated in different scenarios.
Finally, real-world experiments, conducted in multiple and variable underwater contexts, including open sea waters, have led to the collection of several datasets that have been publicly released to the scientific community. The vision system has been integrated in a state of the art AUV equipped with a robotic arm and gripper, and has been exploited in the robot control loop to successfully perform underwater grasping operations.
Abstract:
In emergency situations, where the time available for blood transfusion is limited, O negative blood (the universal donor type) is administered. However, even the universal donor type can sometimes cause transfusion reactions that can be fatal to the patient. As commercial systems do not deliver fast results and are not suitable for emergency situations, this paper presents the steps taken to develop and validate a prototype able to determine blood type compatibilities, even in emergency situations. With the developed system it is thus possible to administer a compatible blood type from the first blood unit transfused. To increase the system’s reliability, the prototype uses two different approaches to classify blood types, the first based on decision trees and the second on support vector machines. The features used to evaluate these classifiers are the standard deviation values, histogram, Histogram of Oriented Gradients and fast Fourier transform, computed on different regions of interest. The main characteristics of the presented prototype are small size, light weight, easy transportation, ease of use, fast results, high reliability and low cost. These characteristics are well suited to the emergency scenarios in which the prototype is expected to be used.
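To make the region-of-interest features in this abstract more concrete, the sketch below computes two of the simpler ones, the standard deviation and a normalized intensity histogram, for a rectangular region of a grayscale image stored as nested lists. It is an illustration only: the function name `roi_features`, the 8-bit intensity range, the bin count and the region coordinates are assumptions, the Histogram of Oriented Gradients and Fourier features are omitted, and no classifier is trained.

```python
from statistics import pstdev


def roi_features(image, top, left, height, width, bins=4):
    """Return [std_dev, hist_0, ..., hist_{bins-1}] for one region.

    image: list of rows of integer intensities in 0..255 (assumed 8-bit).
    """
    pixels = [image[r][c]
              for r in range(top, top + height)
              for c in range(left, left + width)]
    histogram = [0] * bins
    for value in pixels:
        # Map 0..255 onto bin indices 0..bins-1.
        histogram[min(value * bins // 256, bins - 1)] += 1
    normalized = [count / len(pixels) for count in histogram]
    return [pstdev(pixels)] + normalized


if __name__ == "__main__":
    # Tiny made-up image; the region covers the left 2x2 block.
    image = [[0, 0, 9],
             [255, 255, 9]]
    print(roi_features(image, top=0, left=0, height=2, width=2, bins=2))
```

A feature vector of this kind, computed for each region of interest, is what a decision tree or support vector machine would take as input.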
Abstract:
Sol-gel-synthesized bioactive glasses may be formed via a hydrolysis condensation reaction, with silica introduced in the form of tetraethyl orthosilicate (TEOS) and calcium typically added in the form of calcium nitrate. The synthesis reaction proceeds in an aqueous environment; the resultant gel is dried before stabilization by heat treatment. These materials, being amorphous, are complex at the level of their atomic-scale structure, but their bulk properties may only be properly understood on the basis of that structural insight. Thus, a full understanding of their structure-property relationship may only be achieved through the application of a coherent suite of leading-edge experimental probes, coupled with the cogent use of advanced computer simulation methods. Using as an exemplar a calcia-silica sol-gel glass of the kind developed by Larry Hench, to whose memory this paper is dedicated, we illustrate the successful use of high-energy X-ray and neutron scattering (diffraction) methods, magic-angle spinning solid-state NMR, and molecular dynamics simulation as components of a powerful methodology for the study of amorphous materials.
Abstract:
Corwin and Wilcox (1985) sent surveys to more than 100 American colleges and universities to determine their policies on accepting American Sign Language (ASL) as a foreign language. Their results indicated that 81% of those surveyed rejected ASL as a foreign/modern language equivalent. The most frequently stated objection to ASL was that it lacked a culture. Other objections were that ASL is not foreign; that there is no written form and therefore no original body of literature; that it is a derivative of English; and that it is indigenous to the United States and hence not foreign. Building on the work of Corwin and Wilcox, this study sent surveys to 222 American colleges and universities. Noting an expanding cognizance and social awareness of ASL and deafness (as seen in the increasing number of movies, plays, television programs, the Americans with Disabilities Act, and related news stories), this study sought to find out whether ASL was now considered an acceptable foreign language equivalent. The hypothesis of this study was that change has occurred since the 1985 study: that the percentage of postsecondary schools accepting ASL as a foreign/modern language equivalent has increased significantly. The 165 colleges and universities that responded to this author's survey confirmed that there has been a significant shift towards the acceptance of ASL. Only 50% of the respondents objected to ASL as a foreign language equivalent, a significant decrease from the 1985 findings. Among those who objected to granting ASL foreign language credit, the reasons were similar to those of the Corwin and Wilcox study, except that the belief in the absence of a Deaf culture dropped from the top reason listed to the fifth. That ASL is not foreign was the most frequent objection in this study.
One important change that may account for the increased acceptance of ASL is that 16 states (compared to 10 in 1985) now have policies stating that ASL is acceptable as a foreign language equivalent. Two-year colleges in this study were more likely to accept ASL than were four-year colleges and universities. Neither two- nor four-year colleges and universities are likely to include ASL in their foreign language departments, and most schools that have foreign language entrance requirements are unlikely to accept ASL. In colleges and universities where ASL was already offered in some department within the system, there was a significantly higher likelihood that foreign language credit was given for ASL. Respondents from states with laws governing the inclusion of ASL did not usually know their state had a policy. Most respondents, 84%, indicated that their knowledge on the topic of ASL was fair to poor.
Abstract:
Globally, approximately 208 million people aged 15 and older used illicit drugs at least once in the last 12 months, 2 billion consumed alcohol, and tobacco consumption affected 25% (World Drug Report, 2008). In the United States, 20.1 million (8.0%) people aged 12 and older were illicit drug users, 129 million (51.6%) abused alcohol and 70.9 million (28.4%) used tobacco (SAMHSA/OAS, 2008). Although illicit drug and substance abuse is usually considered a problem specific to men (Lynch, 2002), 5.2% of pregnant women aged 15 to 44 are also illicit drug and substance abusers (SAMHSA/OAS, 2007). During pregnancy, illicit drugs and substance abuse (ID/SA) can significantly affect a woman and her infant, contributing to developmental and communication delays for the infant and influencing parenting abilities (Budden, 1996; March of Dimes, 2006b; Rossetti, 2000). Feelings of guilt and shame and stressful experiences influence approaches to parenting (Ashley, Marsden, & Brady, 2003; Brazelton & Greenspan, 2000; Ehrmin, 2000; Johnson & Rosen, 1990; Kelley, 1998; Rossetti, 2000; Velez et al., 2004; Zickler, 1999). Parenthood is an expanded role that can be a trying time for those lacking a sense of self-efficacy, and it creates a high vulnerability to stress (Bandura, 1994). Residential treatment programs for ID/SA mothers and their children provide an excellent opportunity for effective interventions (Finkelstein, 1994; Social Care Institute for Excellence, 2005). This experimental study evaluated whether teaching American Sign Language (ASL) to mothers living with their infants/children at an ID/SA residential treatment program increased the mothers’ self-efficacy and decreased their anxiety. Quantitative data were collected using the General Self-Efficacy Scale and the State-Trait Anxiety Inventory, showing both a significant increase in self-efficacy and a significant decrease in anxiety for the mothers. This research adds to the knowledge base concerning ID/SA mothers’ caring for their infants/children.
By providing a simple low cost program, easily incorporated into existing rehabilitation curricula, the study helps educators and healthcare providers better understand the needs of the ID/SA mothers. This study supports Bandura’s theory that parents who are secure in their efficacy can navigate through the various phases of their child’s development and are less vulnerable to stress (Bandura, 1994).
Abstract:
The individual effects that echoic, mand, and sign language training procedures have on the acquisition of verbal behavior have been widely demonstrated, but more efficient strategies are still needed. This study combined all three treatment strategies into one treatment intervention in order to investigate the joint effects they may have on verbal behavior. Six participants took part in the study. Intervention totaled 1 hour/day for 5 days/week until the mastery criterion for motor echoic behavior was achieved. Although only motor echoic behavior was targeted for acquisition, significant increases in spontaneous motor mands were noted in all treatment participants. Additionally, 4 treatment participants demonstrated significant gains in vocal echoics and spontaneous vocal mands. No significant increases were noted for the control participant. Results suggest that this combined procedure may provide more efficient results as a first step in teaching a functional repertoire of verbal behavior to developmentally delayed children.