875 resultados para Computer Vision for Robotics and Automation
Resumo:
Objective: To define and evaluate a Computer-Vision (CV) method for scoring Paced Finger-Tapping (PFT) in Parkinson's disease (PD) using quantitative motion analysis of index-fingers and to compare the obtained scores to the UPDRS (Unified Parkinson's Disease Rating Scale) finger-taps (FT). Background: The naked-eye evaluation of PFT in clinical practice results in coarse resolution to determine PD status. Besides, sensor mechanisms for PFT evaluation may cause patients discomfort. In order to avoid cost and effort of applying wearable sensors, a CV system for non-invasive PFT evaluation is introduced. Methods: A database of 221 PFT videos from 6 PD patients was processed. The subjects were instructed to position their hands above their shoulders besides the face and tap the index-finger against the thumb consistently with speed. They were facing towards a pivoted camera during recording. The videos were rated by two clinicians between symptom levels 0-to-3 using UPDRS-FT. The CV method incorporates a motion analyzer and a face detector. The method detects the face of testee in each video-frame. The frame is split into two images from face-rectangle center. Two regions of interest are located in each image to detect index-finger motion of left and right hands respectively. The tracking of opening and closing phases of dominant hand index-finger produces a tapping time-series. This time-series is normalized by the face height. The normalization calibrates the amplitude in tapping signal which is affected by the varying distance between camera and subject (farther the camera, lesser the amplitude). A total of 15 features were classified using K-nearest neighbor (KNN) classifier to characterize the symptoms levels in UPDRS-FT. The target ratings provided by the raters were averaged. Results: A 10-fold cross validation in KNN classified 221 videos between 3 symptom levels with 75% accuracy. An area under the receiver operating characteristic curves of 82.6% supports feasibility of the obtained features to replicate clinical assessments. Conclusions: The system is able to track index-finger motion to estimate tapping symptoms in PD. It has certain advantages compared to other technologies (e.g. magnetic sensors, accelerometers etc.) for PFT evaluation to improve and automate the ratings
Resumo:
Inferences about leaf anatomical characteristics had largely been made by manually measuring diverse leaf regions, such as cuticle, epidermis and parenchyma to evaluate differences caused by environmental variables. Here we tested an approach for data acquisition and analysis in ecological quantitative leaf anatomy studies based on computer vision and pattern recognition methods. A case study was conducted on Gochnatia polymorpha (Less.) Cabrera (Asteraceae), a Neotropical savanna tree species that has high phenotypic plasticity. We obtained digital images of cross-sections of its leaves developed under different light conditions (sun vs. shade), different seasons (dry vs. wet) and in different soil types (oxysoil vs. hydromorphic soil), and analyzed several visual attributes, such as color, texture and tissues thickness in a perpendicular plane from microscopic images. The experimental results demonstrated that computational analysis is capable of distinguishing anatomical alterations in microscope images obtained from individuals growing in different environmental conditions. The methods presented here offer an alternative way to determine leaf anatomical differences. © 2013 Elsevier B.V.
Resumo:
[EN]This paper describes a low-cost system that allows the user to visualize different glasses models in live video. The user can also move the glasses to adjust its position on the face. The system, which runs at 9.5 frames/s on general-purpose hardware, has a homeostatic module that keeps image parameters controlled. This is achieved by using a camera with motorized zoom, iris, white balance, etc. This feature can be specially useful in environments with changing illumination and shadows, like in an optical shop. The system also includes a face and eye detection module and a glasses management module.
Resumo:
[ES]This paper describes some simple but useful computer vision techniques for human-robot interaction. First, an omnidirectional camera setting is described that can detect people in the surroundings of the robot, giving their angular positions and a rough estimate of the distance. The device can be easily built with inexpensive components. Second, we comment on a color-based face detection technique that can alleviate skin-color false positives. Third, a simple head nod and shake detector is described, suitable for detecting affirmative/negative, approval/dissaproval, understanding/disbelief head gestures.
Resumo:
Background: Individuals with type 1 diabetes (T1D) have to count the carbohydrates (CHOs) of their meal to estimate the prandial insulin dose needed to compensate for the meal’s effect on blood glucose levels. CHO counting is very challenging but also crucial, since an error of 20 grams can substantially impair postprandial control. Method: The GoCARB system is a smartphone application designed to support T1D patients with CHO counting of nonpacked foods. In a typical scenario, the user places a reference card next to the dish and acquires 2 images with his/her smartphone. From these images, the plate is detected and the different food items on the plate are automatically segmented and recognized, while their 3D shape is reconstructed. Finally, the food volumes are calculated and the CHO content is estimated by combining the previous results and using the USDA nutritional database. Results: To evaluate the proposed system, a set of 24 multi-food dishes was used. For each dish, 3 pairs of images were taken and for each pair, the system was applied 4 times. The mean absolute percentage error in CHO estimation was 10 ± 12%, which led to a mean absolute error of 6 ± 8 CHO grams for normal-sized dishes. Conclusion: The laboratory experiments demonstrated the feasibility of the GoCARB prototype system since the error was below the initial goal of 20 grams. However, further improvements and evaluation are needed prior launching a system able to meet the inter- and intracultural eating habits.
Resumo:
Rho guanosine triphosphatases (GTPases) control the cytoskeletal dynamics that power neurite outgrowth. This process consists of dynamic neurite initiation, elongation, retraction, and branching cycles that are likely to be regulated by specific spatiotemporal signaling networks, which cannot be resolved with static, steady-state assays. We present NeuriteTracker, a computer-vision approach to automatically segment and track neuronal morphodynamics in time-lapse datasets. Feature extraction then quantifies dynamic neurite outgrowth phenotypes. We identify a set of stereotypic neurite outgrowth morphodynamic behaviors in a cultured neuronal cell system. Systematic RNA interference perturbation of a Rho GTPase interactome consisting of 219 proteins reveals a limited set of morphodynamic phenotypes. As proof of concept, we show that loss of function of two distinct RhoA-specific GTPase-activating proteins (GAPs) leads to opposite neurite outgrowth phenotypes. Imaging of RhoA activation dynamics indicates that both GAPs regulate different spatiotemporal Rho GTPase pools, with distinct functions. Our results provide a starting point to dissect spatiotemporal Rho GTPase signaling networks that regulate neurite outgrowth.
Resumo:
This paper outlines an automatic computervision system for the identification of avena sterilis which is a special weed seed growing in cereal crops. The final goal is to reduce the quantity of herbicide to be sprayed as an important and necessary step for precision agriculture. So, only areas where the presence of weeds is important should be sprayed. The main problems for the identification of this kind of weed are its similar spectral signature with respect the crops and also its irregular distribution in the field. It has been designed a new strategy involving two processes: image segmentation and decision making. The image segmentation combines basic suitable image processing techniques in order to extract cells from the image as the low level units. Each cell is described by two area-based attributes measuring the relations among the crops and weeds. The decision making is based on the SupportVectorMachines and determines if a cell must be sprayed. The main findings of this paper are reflected in the combination of the segmentation and the SupportVectorMachines decision processes. Another important contribution of this approach is the minimum requirements of the system in terms of memory and computation power if compared with other previous works. The performance of the method is illustrated by comparative analysis against some existing strategies.
Resumo:
La evolución de los teléfonos móviles inteligentes, dotados de cámaras digitales, está provocando una creciente demanda de aplicaciones cada vez más complejas que necesitan algoritmos de visión artificial en tiempo real; puesto que el tamaño de las señales de vídeo no hace sino aumentar y en cambio el rendimiento de los procesadores de un solo núcleo se ha estancado, los nuevos algoritmos que se diseñen para visión artificial han de ser paralelos para poder ejecutarse en múltiples procesadores y ser computacionalmente escalables. Una de las clases de procesadores más interesantes en la actualidad se encuentra en las tarjetas gráficas (GPU), que son dispositivos que ofrecen un alto grado de paralelismo, un excelente rendimiento numérico y una creciente versatilidad, lo que los hace interesantes para llevar a cabo computación científica. En esta tesis se exploran dos aplicaciones de visión artificial que revisten una gran complejidad computacional y no pueden ser ejecutadas en tiempo real empleando procesadores tradicionales. En cambio, como se demuestra en esta tesis, la paralelización de las distintas subtareas y su implementación sobre una GPU arrojan los resultados deseados de ejecución con tasas de refresco interactivas. Asimismo, se propone una técnica para la evaluación rápida de funciones de complejidad arbitraria especialmente indicada para su uso en una GPU. En primer lugar se estudia la aplicación de técnicas de síntesis de imágenes virtuales a partir de únicamente dos cámaras lejanas y no paralelas—en contraste con la configuración habitual en TV 3D de cámaras cercanas y paralelas—con información de color y profundidad. Empleando filtros de mediana modificados para la elaboración de un mapa de profundidad virtual y proyecciones inversas, se comprueba que estas técnicas son adecuadas para una libre elección del punto de vista. Además, se demuestra que la codificación de la información de profundidad con respecto a un sistema de referencia global es sumamente perjudicial y debería ser evitada. Por otro lado se propone un sistema de detección de objetos móviles basado en técnicas de estimación de densidad con funciones locales. Este tipo de técnicas es muy adecuada para el modelado de escenas complejas con fondos multimodales, pero ha recibido poco uso debido a su gran complejidad computacional. El sistema propuesto, implementado en tiempo real sobre una GPU, incluye propuestas para la estimación dinámica de los anchos de banda de las funciones locales, actualización selectiva del modelo de fondo, actualización de la posición de las muestras de referencia del modelo de primer plano empleando un filtro de partículas multirregión y selección automática de regiones de interés para reducir el coste computacional. Los resultados, evaluados sobre diversas bases de datos y comparados con otros algoritmos del estado del arte, demuestran la gran versatilidad y calidad de la propuesta. Finalmente se propone un método para la aproximación de funciones arbitrarias empleando funciones continuas lineales a tramos, especialmente indicada para su implementación en una GPU mediante el uso de las unidades de filtraje de texturas, normalmente no utilizadas para cómputo numérico. La propuesta incluye un riguroso análisis matemático del error cometido en la aproximación en función del número de muestras empleadas, así como un método para la obtención de una partición cuasióptima del dominio de la función para minimizar el error. ABSTRACT The evolution of smartphones, all equipped with digital cameras, is driving a growing demand for ever more complex applications that need to rely on real-time computer vision algorithms. However, video signals are only increasing in size, whereas the performance of single-core processors has somewhat stagnated in the past few years. Consequently, new computer vision algorithms will need to be parallel to run on multiple processors and be computationally scalable. One of the most promising classes of processors nowadays can be found in graphics processing units (GPU). These are devices offering a high parallelism degree, excellent numerical performance and increasing versatility, which makes them interesting to run scientific computations. In this thesis, we explore two computer vision applications with a high computational complexity that precludes them from running in real time on traditional uniprocessors. However, we show that by parallelizing subtasks and implementing them on a GPU, both applications attain their goals of running at interactive frame rates. In addition, we propose a technique for fast evaluation of arbitrarily complex functions, specially designed for GPU implementation. First, we explore the application of depth-image–based rendering techniques to the unusual configuration of two convergent, wide baseline cameras, in contrast to the usual configuration used in 3D TV, which are narrow baseline, parallel cameras. By using a backward mapping approach with a depth inpainting scheme based on median filters, we show that these techniques are adequate for free viewpoint video applications. In addition, we show that referring depth information to a global reference system is ill-advised and should be avoided. Then, we propose a background subtraction system based on kernel density estimation techniques. These techniques are very adequate for modelling complex scenes featuring multimodal backgrounds, but have not been so popular due to their huge computational and memory complexity. The proposed system, implemented in real time on a GPU, features novel proposals for dynamic kernel bandwidth estimation for the background model, selective update of the background model, update of the position of reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared to other state-of-the-art algorithms, demonstrate the high quality and versatility of our proposal. Finally, we propose a general method for the approximation of arbitrarily complex functions using continuous piecewise linear functions, specially formulated for GPU implementation by leveraging their texture filtering units, normally unused for numerical computation. Our proposal features a rigorous mathematical analysis of the approximation error in function of the number of samples, as well as a method to obtain a suboptimal partition of the domain of the function to minimize approximation error.
Resumo:
In this article, we present a new framework oriented to teach Computer Vision related subjects called JavaVis. It is a computer vision library divided in three main areas: 2D package is featured for classical computer vision processing; 3D package, which includes a complete 3D geometric toolset, is used for 3D vision computing; Desktop package comprises a tool for graphic designing and testing of new algorithms. JavaVis is designed to be easy to use, both for launching and testing existing algorithms and for developing new ones.
Resumo:
Machine vision is an important subject in computer science and engineering degrees. For laboratory experimentation, it is desirable to have a complete and easy-to-use tool. In this work we present a Java library, oriented to teaching computer vision. We have designed and built the library from the scratch with enfasis on readability and understanding rather than on efficiency. However, the library can also be used for research purposes. JavaVis is an open source Java library, oriented to the teaching of Computer Vision. It consists of a framework with several features that meet its demands. It has been designed to be easy to use: the user does not have to deal with internal structures or graphical interface, and should the student need to add a new algorithm it can be done simply enough. Once we sketch the library, we focus on the experience the student gets using this library in several computer vision courses. Our main goal is to find out whether the students understand what they are doing, that is, find out how much the library helps the student in grasping the basic concepts of computer vision. In the last four years we have conducted surveys to assess how much the students have improved their skills by using this library.
Resumo:
Mode of access: Internet.
Resumo:
This paper describes the real time global vision system for the robot soccer team the RoboRoos. It has a highly optimised pipeline that includes thresholding, segmenting, colour normalising, object recognition and perspective and lens correction. It has a fast ‘paint’ colour calibration system that can calibrate in any face of the YUV or HSI cube. It also autonomously selects both an appropriate camera gain and colour gains robot regions across the field to achieve colour uniformity. Camera geometry calibration is performed automatically from selection of keypoints on the field. The system acheives a position accuracy of better than 15mm over a 4m × 5.5m field, and orientation accuracy to within 1°. It processes 614 × 480 pixels at 60Hz on a 2.0GHz Pentium 4 microprocessor.
Resumo:
This paper presents the implementation of a modified particle filter for vision-based simultaneous localization and mapping of an autonomous robot in a structured indoor environment. Through this method, artificial landmarks such as multi-coloured cylinders can be tracked with a camera mounted on the robot, and the position of the robot can be estimated at the same time. Experimental results in simulation and in real environments show that this approach has advantages over the extended Kalman filter with ambiguous data association and various levels of odometric noise.
Resumo:
There have been two main approaches to feature detection in human and computer vision - luminance-based and energy-based. Bars and edges might arise from peaks of luminance and luminance gradient respectively, or bars and edges might be found at peaks of local energy, where local phases are aligned across spatial frequency. This basic issue of definition is important because it guides more detailed models and interpretations of early vision. Which approach better describes the perceived positions of elements in a 3-element contour-alignment task? We used the class of 1-D images defined by Morrone and Burr in which the amplitude spectrum is that of a (partially blurred) square wave and Fourier components in a given image have a common phase. Observers judged whether the centre element (eg ±458 phase) was to the left or right of the flanking pair (eg 0º phase). Lateral offset of the centre element was varied to find the point of subjective alignment from the fitted psychometric function. This point shifted systematically to the left or right according to the sign of the centre phase, increasing with the degree of blur. These shifts were well predicted by the location of luminance peaks and other derivative-based features, but not by energy peaks which (by design) predicted no shift at all. These results on contour alignment agree well with earlier ones from a more explicit feature-marking task, and strongly suggest that human vision does not use local energy peaks to locate basic first-order features. [Supported by the Wellcome Trust (ref: 056093)]
Resumo:
In response to the increasing international competitiveness, many manufacturing businesses are rethinking their management strategies and philosophies towards achieving a computer integrated environment. The explosive growth in Advanced Manufacturing Technology (AMI) has resulted in the formation of functional "Islands of Automation" such as Computer Aided Design (CAD), Computer Aided Manufacturing (CAM), Computer Aided Process Planning (CAPP) and Manufacturing Resources Planning (MRPII). This has resulted in an environment which has focussed areas of excellence and poor overall efficiency, co-ordination and control. The main role of Computer Integrated Manufacturing (CIM) is to integrate these islands of automation and develop a totally integrated and controlled environment. However, the various perceptions of CIM, although developing, remain focussed on a very narrow integration scope and have consequently resulted in mere linked islands of automation with little improvement in overall co-ordination and control. This thesis, that is the research described within, develops and examines a more holistic view of CIM, which is based on the integration of various business elements. One particular business element, namely control, has been shown to have a multi-facetted and underpinning relationship with the CIM philosophy. This relationship impacts various CIM system design aspects including the CIM business analysis and modelling technique, the specification of systems integration requirements, the CIM system architectural form and the degree of business redesign. The research findings show that fundamental changes to CIM system design are required; these are incorporated in a generic CIM design methodology. The affect and influence of this holistic view of CIM on a manufacturing business has been evaluated through various industrial case study applications. Based on the evidence obtained, it has been concluded that this holistic, control based approach to CIM can provide a greatly improved means of achieving a totally integrated and controlled business environment. This generic CIM methodology will therefore make a significant contribution to the planning, modelling, design and development of future CIM systems.