888 results for Computer vision system
Abstract:
This paper presents a computer vision system that successfully discriminates between weed patches and crop rows under uncontrolled lighting in real time. The system consists of two independent subsystems: a fast image-processing stage that delivers results in real time (Fast Image Processing, FIP), and a slower, more accurate stage (Robust Crop Row Detection, RCRD) that is used to correct the first subsystem's mistakes. This combination produces a system that achieves very good results under a wide variety of conditions. Tested on several maize videos taken in different fields and in different years, the system successfully detects an average of 95% of weeds and 80% of crops under different illumination, soil humidity and weed/crop growth conditions. Moreover, the system has been shown to produce acceptable results even under very difficult conditions, such as in the presence of dramatic sowing errors or abrupt camera movements. The computer vision system has been developed for integration into a treatment system, because the ideal setup for any weed sprayer would include a tool that provides information on the weeds and crops present at each point in real time, while the tractor carrying the spraying bar is moving.
Abstract:
In this paper we tackle the problem of landing a helicopter autonomously on a ship deck, using an on-board colour camera as the main sensor. To create a test-bed, we first simulate, randomly but realistically, the movement of a ship's landing platform at sea for different sea states and different ships. We use a commercial parallel robot to reproduce this movement. We then developed an accurate and robust computer vision system to measure the pose of the helipad with respect to the on-board camera. To deal with noise and possible failures of the computer vision system, a state estimator was created. With all of this in place, we are now able to develop and test a controller that closes the loop and completes the autonomous landing task.
Abstract:
A major impediment to developing real-time computer vision systems has been the computational power and level of skill required to process video streams in real time. As a result, many researchers have either analysed video streams off-line or used expensive dedicated hardware acceleration techniques. Recent software and hardware developments have greatly eased the development burden of real-time image analysis, leading to portable systems that use cheap PC hardware and software exploiting the Multimedia Extension (MMX) instruction set of the Intel Pentium chip. This paper describes the implementation of a computationally efficient computer vision system for recognizing hand gestures, using efficient coding and MMX acceleration to achieve real-time performance on low-cost hardware.
Abstract:
A sizeable amount of the testing in eye care requires either the identification of targets such as letters to assess functional vision, or the subjective evaluation of imagery by an examiner. Computers can render a variety of different targets on their monitors and can be used to store and analyse ophthalmic images. However, existing computing hardware tends to be large, screen resolutions are often too low, and objective assessments of ophthalmic images are unreliable. Recent advances in mobile computing hardware and computer vision systems can be used to enhance clinical testing in optometry. High-resolution touch screens embedded in mobile devices can render targets at a wide variety of distances and can be used to record and respond to patient responses, automating testing methods. This has opened up new opportunities in computerised near vision testing. Equally, new image processing techniques can be used to increase the validity and reliability of objective computer vision systems. Three novel apps for assessing reading speed, contrast sensitivity and amplitude of accommodation were created by the author to demonstrate the potential of mobile computing to enhance clinical measurement. The reading speed app could present sentences effectively, control illumination and automate the testing procedure for reading speed assessment. The contrast sensitivity app made use of a bit-stealing technique and a swept-frequency target to rapidly assess a patient's full contrast sensitivity function at both near and far distances. Finally, customised electronic hardware was created and interfaced to an app on a smartphone to allow free-space amplitude of accommodation measurement. A new geometrical model of the tear film and a ray-tracing simulation of a Placido disc topographer were produced to provide insights into the effect of tear film breakdown on ophthalmic images.
Furthermore, a new computer vision system that used a novel eyelash segmentation technique was created to demonstrate the potential of computer vision systems for the clinical assessment of tear stability. Studies undertaken by the author to assess the validity and repeatability of the novel apps found that their repeatability was comparable to, or better than, that of existing clinical methods for reading speed and contrast sensitivity assessment. Furthermore, the apps offered reduced examination times in comparison to their paper-based equivalents. The reading speed and amplitude of accommodation apps correlated highly with existing methods of assessment, supporting their validity. There still remain questions over the validity of using a swept-frequency sine-wave target to assess patients' contrast sensitivity functions, as no clinical test provides the same range of spatial frequencies and contrasts, nor equivalent assessment at distance and near. A validation study of the new computer vision system found that the author's tear metric correlated better with existing subjective measures of tear film stability than those of a competing computer vision system. However, repeatability was poor in comparison to the subjective measures due to eyelash interference. The new mobile apps, computer vision system, and studies outlined in this thesis provide further insight into the potential of applying mobile and image processing technology to enhance clinical testing by eye care professionals.
Abstract:
The Instituto Universitario de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería, and in particular its División de Robótica y Oceanografía Computacional, is developing an autonomous surface sailboat that requires a system for detecting and avoiding obstacles. This system has been developed on a Raspberry Pi, with a service for image capture and a web server that allows the camera configuration to be modified. Once this infrastructure was complete, the photographs that make up the training set for the computer vision system were taken, and the vision system itself was then developed. The results have been integrated with the control system, which changes course when an obstacle is detected.
Abstract:
The surface colour of food is the first quality parameter evaluated by consumers and is critical to product acceptance, so accurate colour measurement is an important tool. This research evaluated colour variation in whitemouth croaker (Micropogonias furnieri) stored in ice for 16 days; the parameters lightness (L*), chromatic value a*, chromatic value b*, total colour difference (ΔE) and chroma (C*) were obtained with a computer vision system and with a Konica Minolta CR-400 colorimeter. The freshness of the croaker, based on changes in gill colour, was evaluated using a computer vision system. The oxidation of myoglobin in fillets of black drum (Pogonias cromis) was also modelled, using the redness parameters (a* value and R). A computer vision system was used to record the colour changes over 57.6 h, and the chemical analysis was carried out by determining the metmyoglobin concentration (%). In the colour evaluation of croaker stored in ice, the computer vision system showed significant differences for L*, a*, ΔE and C*, whereas the colorimeter showed significant differences for L* and ΔE; the only parameter that did not differ between instruments was ΔE. The correlation coefficient between the colour parameters (L*, a* and b*) of the gills of croaker stored in ice and storage time was 0.9747. The computer vision system recorded the colour changes in black drum fillets, and the changes were modelled with an exponential model. The computer vision system proved more sensitive to colour changes during the colour evaluation of croaker stored in ice. It is possible to predict the storage time of croaker in ice from the change in gill colour. Likewise, it was possible to model the variation of myoglobin in black drum fillets using computer vision systems to record these changes.
Computer vision systems have great capacity to record colour changes, and they can be used to evaluate foods on the basis of colour.
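The derived colour parameters named in the abstract above, chroma (C*) and total colour difference (ΔE), follow from the L*, a*, b* coordinates. The sketch below assumes the plain Euclidean (CIE76) form of ΔE, since the abstract does not state which ΔE formula was used; the function names and the sample readings are hypothetical and are not data from the study.

```python
import math


def chroma(a_star, b_star):
    """Chroma C* computed from the chromatic coordinates a* and b*."""
    return math.hypot(a_star, b_star)


def delta_e(lab_ref, lab_sample):
    """Total colour difference between two (L*, a*, b*) triples.

    Assumption: the Euclidean CIE76 definition; the study may have
    used a different formula.
    """
    return math.dist(lab_ref, lab_sample)


# Hypothetical readings for illustration only (not measured values).
day_0 = (60.0, 12.0, 5.0)
day_16 = (57.0, 8.0, 5.0)

print(chroma(3.0, 4.0))        # 5.0
print(delta_e(day_0, day_16))  # 5.0
```

Applied to each storage day against the day-0 reading, a function like `delta_e` would yield the kind of ΔE series that can then be compared between instruments.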
Abstract:
A computer vision system that has to interact in natural language needs to understand the visual appearance of interactions between objects along with the appearance of objects themselves. Relationships between objects are frequently mentioned in queries of tasks like semantic image retrieval, image captioning, visual question answering and natural language object detection. Hence, it is essential to model context between objects for solving these tasks. In the first part of this thesis, we present a technique for detecting an object mentioned in a natural language query. Specifically, we work with referring expressions which are sentences that identify a particular object instance in an image. In many referring expressions, an object is described in relation to another object using prepositions, comparative adjectives, action verbs etc. Our proposed technique can identify both the referred object and the context object mentioned in such expressions. Context is also useful for incrementally understanding scenes and videos. In the second part of this thesis, we propose techniques for searching for objects in an image and events in a video. Our proposed incremental algorithms use the context from previously explored regions to prioritize the regions to explore next. The advantage of incremental understanding is restricting the amount of computation time and/or resources spent for various detection tasks. Our first proposed technique shows how to learn context in indoor scenes in an implicit manner and use it for searching for objects. The second technique shows how explicitly written context rules of one-on-one basketball can be used to sequentially detect events in a game.
Abstract:
Hand gestures are a powerful means of human communication, with many potential applications in the area of human-computer interaction. Vision-based hand gesture recognition techniques have many proven advantages compared with traditional devices, giving users a simpler and more natural way to communicate with electronic devices. This work proposes a generic system architecture based on computer vision and machine learning that can be used with any interface for human-computer interaction. The proposed solution is mainly composed of three modules: a pre-processing and hand segmentation module, a static gesture interface module and a dynamic gesture interface module. The experiments showed that the core of vision-based interaction systems could be the same for all applications, which facilitates implementation. For hand posture recognition, an SVM (Support Vector Machine) model was trained and used, achieving a final accuracy of 99.4%. For dynamic gestures, an HMM (Hidden Markov Model) was trained for each gesture that the system could recognize, with a final average accuracy of 93.7%. The proposed solution has the advantage of being generic, with trained models able to work in real time, allowing its use in a wide range of human-machine applications. To validate the proposed framework, two applications were implemented. The first is a real-time system able to interpret the Portuguese Sign Language. The second is an online system able to help a robotic soccer referee judge a game in real time.
Abstract:
Vision-based hand gesture recognition is an area of active research in computer vision and machine learning. Because gestures are a natural means of human interaction, many researchers are working in this area with the goal of making human-computer interaction (HCI) easier and more natural, without the need for any extra devices. The primary goal of gesture recognition research is therefore to create systems that can identify specific human gestures and use them, for example, to convey information. To that end, vision-based hand gesture interfaces require fast and extremely robust hand detection and gesture recognition in real time. Hand gestures are a powerful human communication modality with many potential applications, among them sign language recognition, sign language being the communication method of deaf people. Sign languages are not standard and universal, and their grammars differ from country to country. In this paper, a real-time system able to interpret the Portuguese Sign Language is presented and described. Experiments showed that the system was able to reliably recognize the vowels in real time, with an accuracy of 99.4% with one dataset of features and 99.6% with a second dataset of features. Although the implemented solution was only trained to recognize the vowels, it is easily extended to recognize the rest of the alphabet, making it a solid foundation for the development of any vision-based sign language recognition user interface system.
Abstract:
Hand gestures are a powerful means of human communication, with many potential applications in the area of human-computer interaction. Vision-based hand gesture recognition techniques have many proven advantages compared with traditional devices, giving users a simpler and more natural way to communicate with electronic devices. This work proposes a generic system architecture based on computer vision and machine learning that can be used with any interface for human-computer interaction. The proposed solution is mainly composed of three modules: a pre-processing and hand segmentation module, a static gesture interface module and a dynamic gesture interface module. The experiments showed that the core of vision-based interaction systems can be the same for all applications, which facilitates implementation. In order to test the proposed solutions, three prototypes were implemented. For hand posture recognition, an SVM model was trained and used, achieving a final accuracy of 99.4%. For dynamic gestures, an HMM was trained for each gesture that the system could recognize, with a final average accuracy of 93.7%. The proposed solution has the advantage of being generic, with trained models able to work in real time, allowing its use in a wide range of human-machine applications.
Abstract:
This work discusses the use of optical flow to generate the sensory information a mobile robot needs to react to the presence of obstacles when navigating in an unstructured environment. A sensing system based on optical flow and time-to-collision calculation is proposed and tested, and it meets two important requirements. The first is that all computations are performed onboard the robot, in spite of the limited computational capability available. The second is that the algorithms for optical flow and time-to-collision calculation are fast enough to give the mobile robot the capability of reacting to any environmental change in real time. Results are presented from real experiments in which the proposed sensing system is the only source of sensory data guiding a mobile robot to avoid obstacles while wandering, and the analysis of these results validates the proposed sensing system.
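As an illustration of how a time-to-collision value can be derived from optical flow, the sketch below uses the common relation that, for a camera translating towards a surface, an image point at distance r from the focus of expansion that moves outward at speed dr/dt has a time to collision of r / (dr/dt). This is a generic textbook formulation and not necessarily the calculation used in the work; the function name and the sample values are hypothetical.

```python
import math


def time_to_collision(point, flow, foe):
    """Time to collision, in frames, estimated at one image point.

    point: (x, y) pixel position
    flow:  (u, v) optical flow at that pixel, in pixels per frame
    foe:   (x, y) focus of expansion, assumed to be known
    """
    rx, ry = point[0] - foe[0], point[1] - foe[1]
    r = math.hypot(rx, ry)
    if r == 0.0:
        return math.inf  # no radial motion is measurable at the FOE itself
    # Component of the flow pointing away from the focus of expansion.
    radial_speed = (flow[0] * rx + flow[1] * ry) / r
    if radial_speed <= 0.0:
        return math.inf  # the image is not expanding here: no approach
    return r / radial_speed


# A point 50 px from the FOE moving outward at 5 px/frame.
ttc = time_to_collision(point=(200.0, 150.0), flow=(4.0, 3.0), foe=(160.0, 120.0))
print(ttc)  # 10.0
```

A reactive controller could, for example, compare the smallest such value over the image against a threshold to decide when to turn away; that policy is an assumption here, not something the abstract specifies.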
Abstract:
The Casa da Música Foundation, responsible for the management of the Casa da Música do Porto building, needs statistical data on the number of visitors to the building. This information is a valuable tool for preparing periodic reports on the success of this cultural institution. For this reason it was necessary to develop a system capable of returning the number of visitors for a requested period of time. This is a complex task because of the building's unique architectural design, characterized by very large doors and halls, and because of the sudden large numbers of people who pass through them in the moments preceding and following the different activities held in the building. To reach a technical solution for this challenge, several image processing methods for people detection with still cameras were first studied. The next step was the development of a real-time algorithm, using OpenCV libraries and computer vision concepts, to count individuals with the desired accuracy. This algorithm incorporates the scientific and technical knowledge acquired in the study of the previous methods. The themes developed in this thesis comprise the fields of background maintenance, shadow and highlight detection, and blob detection and tracking. A graphical interface was also built to help with the development, testing and tuning of the proposed system, as a complement to the work. Furthermore, tests of the system were performed to certify the proposed techniques against a limited set of circumstances. The results obtained revealed that the algorithm was successfully applied to count the number of people in complex environments with reliable accuracy.
Abstract:
The process of visually exploring underwater environments is still a complex problem. Underwater vision systems require complementary sources of sensor information to help overcome water disturbances. This work proposes the development of calibration methods for a structured-light system consisting of a camera and a laser with a line beam. Two different calibration procedures that require only two images from different viewpoints were developed and tested in dry and underwater environments. The results show an accurate calibration for the camera/projector pair, with errors close to 1 mm even in the presence of a small stereo baseline.
Abstract:
Nowadays, several sensors and mechanisms are available to estimate a mobile robot's trajectory and location with respect to its surroundings. Absolute positioning mechanisms are usually the most accurate, but they are also the most expensive and require pre-installed equipment in the environment. Therefore, a system capable of measuring its own motion and location within the environment (relative positioning) has been a research goal since the beginning of autonomous vehicles. With increasing computational performance, computer vision has become faster, and it has therefore become possible to incorporate it in a mobile robot. In feature-based visual odometry approaches, accurate motion estimation requires that the model be estimated without feature association outliers. Outlier rejection is a delicate process, as there is always a trade-off between the speed and the reliability of the system. This dissertation proposes an indoor 2D positioning system using visual odometry. The mobile robot has a camera pointed at the ceiling for image analysis. As a requirement, the ceiling and the floor (where the robot moves) must be planes. In the literature, RANSAC is a widely used method for outlier rejection. However, it might be slow in critical circumstances. Therefore, a new algorithm is proposed that accelerates RANSAC while maintaining its reliability. The algorithm, called FMBF, consists of comparing image texture patterns between pictures and preserving the most similar ones. There are several types of comparison, with different computational costs and reliability. FMBF manages those comparisons in order to optimize the trade-off between speed and reliability.
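To make the role of RANSAC in outlier rejection concrete, the following is a minimal sketch of plain RANSAC on feature matches. It is not the FMBF algorithm, whose details the abstract does not give. For brevity it assumes the motion between two frames is a pure 2D translation, whereas a ceiling-facing camera on a moving robot would generally also see rotation. The function name, tolerance and sample data are all hypothetical.

```python
import random


def ransac_translation(matches, tol=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from feature matches containing outliers.

    matches: list of ((x1, y1), (x2, y2)) point pairs between two frames.
    Returns the translation (dx, dy) and the list of inlier matches.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A translation has two unknowns, so one match is a minimal sample.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        inliers = [
            (p, q) for p, q in matches
            if abs(q[0] - p[0] - dx) <= tol and abs(q[1] - p[1] - dy) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine the estimate by averaging over the largest consensus set.
    n = len(best_inliers)
    dx = sum(q[0] - p[0] for p, q in best_inliers) / n
    dy = sum(q[1] - p[1] for p, q in best_inliers) / n
    return (dx, dy), best_inliers


# Eight consistent matches (shift of +5, -3) plus three gross mismatches.
good = [((10.0 * i, 7.0 * i), (10.0 * i + 5.0, 7.0 * i - 3.0)) for i in range(8)]
bad = [((0.0, 0.0), (40.0, 40.0)),
       ((10.0, 10.0), (-30.0, 5.0)),
       ((20.0, 5.0), (21.0, 60.0))]
matches = good + bad

(dx, dy), inliers = ransac_translation(matches)
print(dx, dy, len(inliers))
```

The cost of the loop grows with the number of iterations needed to draw an outlier-free sample, which is the speed and reliability trade-off the abstract refers to; pre-filtering unlikely matches before RANSAC is one general way to reduce it.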
Abstract:
When underwater vehicles navigate close to the ocean floor, computer vision techniques can be applied to obtain motion estimates. A complete system to create visual mosaics of the seabed is described in this paper. Unfortunately, the accuracy of the constructed mosaic is difficult to evaluate. The use of a laboratory setup to obtain an accurate error measurement is proposed. The system consists of a robot arm carrying a downward-looking camera. A pattern formed by a white background and a matrix of black dots uniformly distributed over the surveyed scene is used to find the exact image registration parameters. When the robot executes a trajectory (simulating the motion of a submersible), an image sequence is acquired by the camera. The estimated motion computed from the encoders of the robot is refined by detecting, to subpixel accuracy, the black dots in the image sequence, and computing the 2D projective transform that relates two consecutive images. The pattern is then replaced by a poster of the sea floor and the trajectory is executed again, acquiring the image sequence used to test the accuracy of the mosaicking system.
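The 2D projective transform mentioned above can be written as a 3x3 matrix acting on homogeneous pixel coordinates. The sketch below shows, under that standard formulation, how such a matrix maps a point and how the transforms of consecutive image pairs can be chained to place a later frame in the coordinates of the first one. It does not estimate the transform from the detected dots, and the function names and example matrices are hypothetical.

```python
def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 projective transform H."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)  # divide out the homogeneous scale


def compose(H_a, H_b):
    """Matrix product H_a * H_b: apply H_b first, then H_a."""
    return [
        [sum(H_a[i][k] * H_b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# Hypothetical transforms: frame 2 -> frame 1 shifts x by 10 px,
# frame 3 -> frame 2 shifts y by 5 px.
H_21 = [[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H_32 = [[1.0, 0.0, 0.0], [0.0, 1.0, 5.0], [0.0, 0.0, 1.0]]

# Chaining gives the transform from frame 3 into frame 1 coordinates.
H_31 = compose(H_21, H_32)
print(apply_homography(H_31, (1.0, 1.0)))  # (11.0, 6.0)
```

Because each pairwise transform carries a small error, chaining many of them lets the error accumulate along the trajectory, which is one reason an independent ground truth is useful for evaluating a mosaic.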