883 results for OpenCV Computer Vision Object Detection Automatic Counting
Abstract:
This book will serve as a foundation for a variety of useful applications of graph theory to computer vision, pattern recognition, and related areas. It covers a representative set of novel graph-theoretic methods for complex computer vision and pattern recognition tasks. The first part of the book presents the application of graph theory to low-level processing of digital images, such as a new method for partitioning a given image into a hierarchy of homogeneous areas using graph pyramids, or a study of the relationship between graph theory and digital topology. Part II presents graph-theoretic learning algorithms for high-level computer vision and pattern recognition applications, including a survey of graph-based methodologies for pattern recognition and computer vision, a presentation of a series of computationally efficient algorithms for testing graph isomorphism and related graph matching tasks in pattern recognition, and a new graph distance measure to be used for solving graph matching problems. Finally, Part III provides detailed descriptions of several applications of graph-based methods to real-world pattern recognition tasks. It includes a critical review of the main graph-based and structural methods for fingerprint classification, a new method to visualize time series of graphs, and potential applications in computer network monitoring and abnormal event detection.
Abstract:
Background: Individuals with type 1 diabetes (T1D) have to count the carbohydrates (CHOs) of their meal to estimate the prandial insulin dose needed to compensate for the meal’s effect on blood glucose levels. CHO counting is very challenging but also crucial, since an error of 20 grams can substantially impair postprandial control. Method: The GoCARB system is a smartphone application designed to support T1D patients with CHO counting of nonpacked foods. In a typical scenario, the user places a reference card next to the dish and acquires 2 images with a smartphone. From these images, the plate is detected and the different food items on the plate are automatically segmented and recognized, while their 3D shape is reconstructed. Finally, the food volumes are calculated and the CHO content is estimated by combining the previous results and using the USDA nutritional database. Results: To evaluate the proposed system, a set of 24 multi-food dishes was used. For each dish, 3 pairs of images were taken and, for each pair, the system was applied 4 times. The mean absolute percentage error in CHO estimation was 10 ± 12%, which led to a mean absolute error of 6 ± 8 CHO grams for normal-sized dishes. Conclusion: The laboratory experiments demonstrated the feasibility of the GoCARB prototype system, since the error was below the initial goal of 20 grams. However, further improvements and evaluation are needed before launching a system able to accommodate inter- and intracultural eating habits.
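The two error figures reported in this abstract, the mean absolute percentage error and the mean absolute error, can be computed as in the minimal sketch below. The function names and the sample values are hypothetical and purely illustrative; they are not data or code from the GoCARB study.

```python
from statistics import mean


def mean_absolute_error(estimated, reference):
    """Mean of |estimate - reference|, in the unit of the inputs (grams of CHO)."""
    return mean(abs(e - r) for e, r in zip(estimated, reference))


def mean_absolute_percentage_error(estimated, reference):
    """Mean of |estimate - reference| / reference, expressed as a percentage."""
    return 100.0 * mean(abs(e - r) / r for e, r in zip(estimated, reference))


# Made-up values for illustration only (grams of carbohydrate per dish).
reference_cho = [50.0, 80.0, 40.0]
estimated_cho = [55.0, 72.0, 42.0]

mae = mean_absolute_error(estimated_cho, reference_cho)
mape = mean_absolute_percentage_error(estimated_cho, reference_cho)
print(f"MAE = {mae:.2f} g, MAPE = {mape:.2f} %")
```

The percentage error is relative to the reference value of each dish, so the same absolute error in grams weighs more on a small dish than on a large one.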
Abstract:
This paper outlines an automatic computer vision system for the identification of Avena sterilis, a weed species that grows in cereal crops. The final goal is to reduce the quantity of herbicide sprayed, an important and necessary step for precision agriculture: only areas where the presence of weeds is significant should be sprayed. The main difficulties in identifying this kind of weed are its spectral signature, which is similar to that of the crops, and its irregular distribution in the field. A new strategy has been designed involving two processes: image segmentation and decision making. The image segmentation combines basic image processing techniques to extract cells from the image as the low-level units. Each cell is described by two area-based attributes measuring the relations between crops and weeds. The decision making is based on Support Vector Machines and determines whether a cell must be sprayed. The main findings of this paper lie in the combination of the segmentation and the Support Vector Machines decision processes. Another important contribution of this approach is its low memory and computation requirements compared with previous works. The performance of the method is illustrated by comparative analysis against some existing strategies.
Abstract:
A spatial-color-based non-parametric background-foreground modeling strategy, implemented on a GPGPU using CUDA, is proposed. This strategy is suitable for augmented-reality applications, providing real-time, high-quality results in a great variety of scenarios.
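As a rough illustration of what non-parametric modeling over spatial and color features can look like, the sketch below scores a pixel against a set of stored reference samples with a kernel density estimate. It is a plain-Python CPU sketch, not the CUDA implementation described in the abstract; the Gaussian kernel, the feature layout (x, y, r, g, b), the bandwidth values and all names are assumptions made for the example.

```python
import math


def gaussian_kde(sample, references, bandwidths):
    """Kernel density estimate of `sample` given a list of reference samples.

    Each sample is a tuple of features, here (x, y, r, g, b). `bandwidths`
    holds one Gaussian standard deviation per feature (assumed values).
    """
    total = 0.0
    for ref in references:
        kernel = 1.0
        for value, ref_value, h in zip(sample, ref, bandwidths):
            z = (value - ref_value) / h
            kernel *= math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))
        total += kernel
    return total / len(references)


# Hypothetical background samples collected around one pixel: (x, y, r, g, b).
background = [
    (10, 10, 100, 100, 100),
    (10, 11, 102, 99, 101),
    (11, 10, 98, 101, 100),
]
bandwidths = (2.0, 2.0, 10.0, 10.0, 10.0)

# A pixel close to the stored samples scores far higher than an outlier.
p_similar = gaussian_kde((10, 10, 101, 100, 99), background, bandwidths)
p_different = gaussian_kde((10, 10, 200, 30, 30), background, bandwidths)
print(p_similar > p_different)
```

Every pixel is evaluated independently against its reference samples, which is the property that makes this kind of model a natural fit for a GPU.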
Abstract:
Electronic devices endowed with camera platforms require new and powerful machine vision applications, which commonly include moving object detection strategies. To obtain high-quality results, the most recent strategies estimate nonparametrically background and foreground models and combine them by means of a Bayesian classifier. However, typical classifiers are limited by the use of constant prior values and they do not allow the inclusion of additional spatiodependent prior information. In this Letter, we propose an alternative Bayesian classifier that, unlike those reported before, allows the use of additional prior information obtained from any source and depending on the spatial position of each pixel.
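The idea of a prior that varies with pixel position can be sketched with Bayes' rule applied per pixel. The code below is a minimal illustration under stated assumptions, not the classifier proposed in the Letter: the function names, the 0.5 decision threshold and the toy likelihood values are all hypothetical.

```python
def foreground_posterior(likelihood_fg, likelihood_bg, prior_fg):
    """Posterior probability of foreground for one pixel via Bayes' rule."""
    evidence = likelihood_fg * prior_fg + likelihood_bg * (1.0 - prior_fg)
    if evidence == 0.0:
        return prior_fg  # neither model explains the pixel: fall back to the prior
    return likelihood_fg * prior_fg / evidence


def classify(lik_fg, lik_bg, prior_map, threshold=0.5):
    """Label each pixel as foreground (True) using a per-pixel prior map."""
    return [
        [foreground_posterior(f, b, p) > threshold for f, b, p in zip(rf, rb, rp)]
        for rf, rb, rp in zip(lik_fg, lik_bg, prior_map)
    ]


# Hypothetical 1x2 image: identical likelihoods, different spatial priors.
lik_fg = [[0.4, 0.4]]
lik_bg = [[0.6, 0.6]]
prior_map = [[0.2, 0.8]]

mask = classify(lik_fg, lik_bg, prior_map)
print(mask)
```

With a constant prior both pixels would receive the same label; here the prior map alone changes the decision for the second pixel, which is the effect a spatially dependent prior is meant to allow.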
Abstract:
A novel, high-quality system for moving object detection in sequences recorded with moving cameras is proposed. This system is based on the collaboration between an automatic homography estimation module for image alignment and a robust moving object detector that uses efficient spatiotemporal nonparametric background modeling.
Abstract:
The evolution of smartphones, all equipped with digital cameras, is driving a growing demand for ever more complex applications that need to rely on real-time computer vision algorithms. However, video signals are only increasing in size, whereas the performance of single-core processors has somewhat stagnated in the past few years.
Consequently, new computer vision algorithms will need to be parallel to run on multiple processors and be computationally scalable. One of the most promising classes of processors nowadays can be found in graphics processing units (GPU). These are devices offering a high degree of parallelism, excellent numerical performance and increasing versatility, which makes them interesting for running scientific computations. In this thesis, we explore two computer vision applications with a high computational complexity that precludes them from running in real time on traditional uniprocessors. However, we show that by parallelizing subtasks and implementing them on a GPU, both applications attain their goals of running at interactive frame rates. In addition, we propose a technique for fast evaluation of arbitrarily complex functions, specially designed for GPU implementation. First, we explore the application of depth-image–based rendering techniques to the unusual configuration of two convergent, wide-baseline cameras, in contrast to the usual configuration used in 3D TV, which relies on narrow-baseline, parallel cameras. By using a backward mapping approach with a depth inpainting scheme based on median filters, we show that these techniques are adequate for free viewpoint video applications. In addition, we show that referring depth information to a global reference system is ill-advised and should be avoided. Then, we propose a background subtraction system based on kernel density estimation techniques. These techniques are well suited to modelling complex scenes featuring multimodal backgrounds, but have seen little use due to their huge computational and memory complexity.
The proposed system, implemented in real time on a GPU, features novel proposals for dynamic kernel bandwidth estimation for the background model, selective update of the background model, update of the position of reference samples of the foreground model using a multi-region particle filter, and automatic selection of regions of interest to reduce computational cost. The results, evaluated on several databases and compared to other state-of-the-art algorithms, demonstrate the high quality and versatility of our proposal. Finally, we propose a general method for the approximation of arbitrarily complex functions using continuous piecewise linear functions, specially formulated for GPU implementation by leveraging their texture filtering units, normally unused for numerical computation. Our proposal features a rigorous mathematical analysis of the approximation error as a function of the number of samples, as well as a method to obtain a near-optimal partition of the domain of the function to minimize approximation error.
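The last contribution, approximating a function with a continuous piecewise linear one, can be illustrated with a small lookup table and linear interpolation, which is the operation texture filtering hardware performs between neighbouring samples. The sketch below is a plain-Python illustration under stated assumptions, not the thesis's method: it uses a uniform partition rather than an optimized one, and the function names are hypothetical. The bound it checks is the standard one for linear interpolation, |error| ≤ M·h²/8, where M bounds |f''| and h is the sample spacing.

```python
import math


def build_table(f, a, b, n_segments):
    """Sample f at n_segments + 1 uniformly spaced points on [a, b]."""
    step = (b - a) / n_segments
    return [f(a + i * step) for i in range(n_segments + 1)]


def evaluate(table, a, b, x):
    """Linearly interpolate between the two table entries surrounding x."""
    n_segments = len(table) - 1
    t = (x - a) / (b - a) * n_segments
    i = min(max(int(t), 0), n_segments - 1)  # clamp to a valid segment index
    frac = t - i
    return (1.0 - frac) * table[i] + frac * table[i + 1]


# Approximate sin on [0, pi] with 64 linear segments (assumed example setup).
a, b, n = 0.0, math.pi, 64
table = build_table(math.sin, a, b, n)

# Measure the worst error on a dense grid of test points.
points = [a + k * (b - a) / 1000 for k in range(1001)]
max_error = max(abs(evaluate(table, a, b, x) - math.sin(x)) for x in points)

# Standard bound with M = 1, since |sin''| <= 1.
bound = ((b - a) / n) ** 2 / 8.0
print(max_error <= bound)
```

Doubling the number of segments divides the bound by four, which is the kind of error-versus-samples relationship the abstract refers to.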
Abstract:
Strawberries harvested for processing as frozen fruits are currently de-calyxed manually in the field. This process requires the removal of the stem cap with green leaves (i.e. the calyx) and has several disadvantages when performed by hand. Not only does it require maintaining cutting tool sanitation, but it also increases labor time and exposure of the de-capped strawberries before in-plant processing. This leads to labor inefficiency and decreased harvest yield. Moving the calyx removal process from the fields to the processing plants would reduce field labor and improve management and logistics, while increasing annual yield. As labor prices continue to increase, the strawberry industry has shown great interest in the development and implementation of an automated calyx removal system. In response, this dissertation describes the design, operation, and performance of a full-scale automatic vision-guided intelligent de-calyxing (AVID) prototype machine. The AVID machine utilizes commercially available equipment to produce a relatively low-cost automated de-calyxing system that can be retrofitted into existing food processing facilities. This dissertation is divided into five sections. The first two sections include a machine overview and a 12-week processing plant pilot study. Results of the pilot study indicate the AVID machine is able to de-calyx grade-1-with-cap conical strawberries at roughly 66 percent output weight yield at a throughput of 10,000 pounds per hour. The remaining three sections describe in detail the three main components of the machine: a strawberry loading and orientation conveyor, a machine vision system for calyx identification, and a synchronized multi-waterjet knife calyx removal system. In short, the loading system utilizes rotational energy to orient conical strawberries. The machine vision system determines cut locations through RGB real-time feature extraction.
The high-speed multi-waterjet knife system uses direct drive actuation to position 30,000 psi cutting streams at precise coordinates for calyx removal. Based on the observations and studies performed within this dissertation, the AVID machine appears to be a viable option for automated high-throughput strawberry calyx removal. A summary of future tasks and further improvements is given at the end.
Abstract:
Despite improvements in usability and intuitiveness, users still have to adapt to the systems they are offered in order to satisfy their needs. For instance, they must learn how to accomplish tasks, how to interact with the system, and how to meet the system's specifications. This paper proposes an approach to improve this situation by enabling graphical user interface redefinition through virtualization and computer vision, with the aim of increasing the system's usability. To achieve this goal, the approach is based on enriched task models, virtualization and picture-driven computing.
Abstract:
The project aims to advance the state of the art in the use of context information for the classification of image and video data. The use of context in image classification has been shown to be of great importance for improving the performance of current object recognition systems. In our project we proposed the concept of Multi-scale Feature Labels as a general and compact method to exploit local and global context. Extracting features from the discriminative probability or classification confidence label field is highly novel. Moreover, the multi-scale representation of the feature labels leads to a compact and efficient description of the context. A further goal of the project was to provide a general-purpose method and prove its suitability for different image and video analysis problems. The two-year project generated 5 journal publications (plus 2 under submission), 10 conference publications (plus 2 under submission) and one patent (plus 1 pending). A significant number of these publications use the main result of this project to improve results in object detection and/or segmentation.
Abstract:
This work focuses on the use of the Python language and the OpenCV computer vision library to track marine crustaceans under experimental conditions and to determine their behaviour in a social environment.
Abstract:
This work presents the implementation and comparison of three techniques for three-dimensional computer vision:
• Stereo vision: correlation between two 2D images;
• Sensor fusion: use of different sensors (2D camera + 1D ultrasound sensor);
• Structured light.
The techniques were compared with respect to the following characteristics:
• Computational effort (elapsed time to obtain the 3D information);
• Influence of environmental conditions (noise due to non-uniform lighting, overexposure and shadows);
• The cost of the infrastructure for each technique;
• Analysis of uncertainties, precision and accuracy.
Matlab, version 5.1, was chosen to implement the algorithms of the three techniques because of the simplicity of its commands, programming and debugging. In addition, this software is well known and widely used by the academic community, which allows the results of this work to be reproduced and verified. Examples of three-dimensional vision applied to robotic assembly ("pick-and-place") tasks are presented.