878 resultados para Depth Estimation,Deep Learning,Disparity Estimation,Computer Vision,Stereo Vision


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis proposes a solution to the problem of estimating the motion of an Unmanned Underwater Vehicle (UUV). Our approach is based on the integration of the incremental measurements which are provided by a vision system. When the vehicle is close to the underwater terrain, it constructs a visual map (so called "mosaic") of the area where the mission takes place while, at the same time, it localizes itself on this map, following the Concurrent Mapping and Localization strategy. The proposed methodology to achieve this goal is based on a feature-based mosaicking algorithm. A down-looking camera is attached to the underwater vehicle. As the vehicle moves, a sequence of images of the sea-floor is acquired by the camera. For every image of the sequence, a set of characteristic features is detected by means of a corner detector. Then, their correspondences are found in the next image of the sequence. Solving the correspondence problem in an accurate and reliable way is a difficult task in computer vision. We consider different alternatives to solve this problem by introducing a detailed analysis of the textural characteristics of the image. This is done in two phases: first comparing different texture operators individually, and next selecting those that best characterize the point/matching pair and using them together to obtain a more robust characterization. Various alternatives are also studied to merge the information provided by the individual texture operators. Finally, the best approach in terms of robustness and efficiency is proposed. After the correspondences have been solved, for every pair of consecutive images we obtain a list of image features in the first image and their matchings in the next frame. Our aim is now to recover the apparent motion of the camera from these features. Although an accurate texture analysis is devoted to the matching pro-cedure, some false matches (known as outliers) could still appear among the right correspon-dences. For this reason, a robust estimation technique is used to estimate the planar transformation (homography) which explains the dominant motion of the image. Next, this homography is used to warp the processed image to the common mosaic frame, constructing a composite image formed by every frame of the sequence. With the aim of estimating the position of the vehicle as the mosaic is being constructed, the 3D motion of the vehicle can be computed from the measurements obtained by a sonar altimeter and the incremental motion computed from the homography. Unfortunately, as the mosaic increases in size, image local alignment errors increase the inaccuracies associated to the position of the vehicle. Occasionally, the trajectory described by the vehicle may cross over itself. In this situation new information is available, and the system can readjust the position estimates. Our proposal consists not only in localizing the vehicle, but also in readjusting the trajectory described by the vehicle when crossover information is obtained. This is achieved by implementing an Augmented State Kalman Filter (ASKF). Kalman filtering appears as an adequate framework to deal with position estimates and their associated covariances. Finally, some experimental results are shown. A laboratory setup has been used to analyze and evaluate the accuracy of the mosaicking system. This setup enables a quantitative measurement of the accumulated errors of the mosaics created in the lab. Then, the results obtained from real sea trials using the URIS underwater vehicle are shown.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The human visual ability to perceive depth looks like a puzzle. We perceive three-dimensional spatial information quickly and efficiently by using the binocular stereopsis of our eyes and, what is mote important the learning of the most common objects which we achieved through living. Nowadays, modelling the behaviour of our brain is a fiction, that is why the huge problem of 3D perception and further, interpretation is split into a sequence of easier problems. A lot of research is involved in robot vision in order to obtain 3D information of the surrounded scene. Most of this research is based on modelling the stereopsis of humans by using two cameras as if they were two eyes. This method is known as stereo vision and has been widely studied in the past and is being studied at present, and a lot of work will be surely done in the future. This fact allows us to affirm that this topic is one of the most interesting ones in computer vision. The stereo vision principle is based on obtaining the three dimensional position of an object point from the position of its projective points in both camera image planes. However, before inferring 3D information, the mathematical models of both cameras have to be known. This step is known as camera calibration and is broadly describes in the thesis. Perhaps the most important problem in stereo vision is the determination of the pair of homologue points in the two images, known as the correspondence problem, and it is also one of the most difficult problems to be solved which is currently investigated by a lot of researchers. The epipolar geometry allows us to reduce the correspondence problem. An approach to the epipolar geometry is describes in the thesis. Nevertheless, it does not solve it at all as a lot of considerations have to be taken into account. As an example we have to consider points without correspondence due to a surface occlusion or simply due to a projection out of the camera scope. The interest of the thesis is focused on structured light which has been considered as one of the most frequently used techniques in order to reduce the problems related lo stereo vision. Structured light is based on the relationship between a projected light pattern its projection and an image sensor. The deformations between the pattern projected into the scene and the one captured by the camera, permits to obtain three dimensional information of the illuminated scene. This technique has been widely used in such applications as: 3D object reconstruction, robot navigation, quality control, and so on. Although the projection of regular patterns solve the problem of points without match, it does not solve the problem of multiple matching, which leads us to use hard computing algorithms in order to search the correct matches. In recent years, another structured light technique has increased in importance. This technique is based on the codification of the light projected on the scene in order to be used as a tool to obtain an unique match. Each token of light is imaged by the camera, we have to read the label (decode the pattern) in order to solve the correspondence problem. The advantages and disadvantages of stereo vision against structured light and a survey on coded structured light are related and discussed. The work carried out in the frame of this thesis has permitted to present a new coded structured light pattern which solves the correspondence problem uniquely and robust. Unique, as each token of light is coded by a different word which removes the problem of multiple matching. Robust, since the pattern has been coded using the position of each token of light with respect to both co-ordinate axis. Algorithms and experimental results are included in the thesis. The reader can see examples 3D measurement of static objects, and the more complicated measurement of moving objects. The technique can be used in both cases as the pattern is coded by a single projection shot. Then it can be used in several applications of robot vision. Our interest is focused on the mathematical study of the camera and pattern projector models. We are also interested in how these models can be obtained by calibration, and how they can be used to obtained three dimensional information from two correspondence points. Furthermore, we have studied structured light and coded structured light, and we have presented a new coded structured light pattern. However, in this thesis we started from the assumption that the correspondence points could be well-segmented from the captured image. Computer vision constitutes a huge problem and a lot of work is being done at all levels of human vision modelling, starting from a)image acquisition; b) further image enhancement, filtering and processing, c) image segmentation which involves thresholding, thinning, contour detection, texture and colour analysis, and so on. The interest of this thesis starts in the next step, usually known as depth perception or 3D measurement.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Aquesta tesi està emmarcada dins la detecció precoç de masses, un dels símptomes més clars del càncer de mama, en imatges mamogràfiques. Primerament, s'ha fet un anàlisi extensiu dels diferents mètodes de la literatura, concloent que aquests mètodes són dependents de diferent paràmetres: el tamany i la forma de la massa i la densitat de la mama. Així, l'objectiu de la tesi és analitzar, dissenyar i implementar un mètode de detecció robust i independent d'aquests tres paràmetres. Per a tal fi, s'ha construït un patró deformable de la massa a partir de l'anàlisi de masses reals i, a continuació, aquest model és buscat en les imatges seguint un esquema probabilístic, obtenint una sèrie de regions sospitoses. Fent servir l'anàlisi 2DPCA, s'ha construït un algorisme capaç de discernir aquestes regions són realment una massa o no. La densitat de la mama és un paràmetre que s'introdueix de forma natural dins l'algorisme.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La tesis se centra en la Visión por Computador y, más concretamente, en la segmentación de imágenes, la cual es una de las etapas básicas en el análisis de imágenes y consiste en la división de la imagen en un conjunto de regiones visualmente distintas y uniformes considerando su intensidad, color o textura. Se propone una estrategia basada en el uso complementario de la información de región y de frontera durante el proceso de segmentación, integración que permite paliar algunos de los problemas básicos de la segmentación tradicional. La información de frontera permite inicialmente identificar el número de regiones presentes en la imagen y colocar en el interior de cada una de ellas una semilla, con el objetivo de modelar estadísticamente las características de las regiones y definir de esta forma la información de región. Esta información, conjuntamente con la información de frontera, es utilizada en la definición de una función de energía que expresa las propiedades requeridas a la segmentación deseada: uniformidad en el interior de las regiones y contraste con las regiones vecinas en los límites. Un conjunto de regiones activas inician entonces su crecimiento, compitiendo por los píxeles de la imagen, con el objetivo de optimizar la función de energía o, en otras palabras, encontrar la segmentación que mejor se adecua a los requerimientos exprsados en dicha función. Finalmente, todo esta proceso ha sido considerado en una estructura piramidal, lo que nos permite refinar progresivamente el resultado de la segmentación y mejorar su coste computacional. La estrategia ha sido extendida al problema de segmentación de texturas, lo que implica algunas consideraciones básicas como el modelaje de las regiones a partir de un conjunto de características de textura y la extracción de la información de frontera cuando la textura es presente en la imagen. Finalmente, se ha llevado a cabo la extensión a la segmentación de imágenes teniendo en cuenta las propiedades de color y textura. En este sentido, el uso conjunto de técnicas no-paramétricas de estimación de la función de densidad para la descripción del color, y de características textuales basadas en la matriz de co-ocurrencia, ha sido propuesto para modelar adecuadamente y de forma completa las regiones de la imagen. La propuesta ha sido evaluada de forma objetiva y comparada con distintas técnicas de integración utilizando imágenes sintéticas. Además, se han incluido experimentos con imágenes reales con resultados muy positivos.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents an enhanced hypothesis verification strategy for 3D object recognition. A new learning methodology is presented which integrates the traditional dichotomic object-centred and appearance-based representations in computer vision giving improved hypothesis verification under iconic matching. The "appearance" of a 3D object is learnt using an eigenspace representation obtained as it is tracked through a scene. The feature representation implicitly models the background and the objects observed enabling the segmentation of the objects from the background. The method is shown to enhance model-based tracking, particularly in the presence of clutter and occlusion, and to provide a basis for identification. The unified approach is discussed in the context of the traffic surveillance domain. The approach is demonstrated on real-world image sequences and compared to previous (edge-based) iconic evaluation techniques.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work proposes a method to determine the depth of objects in a scene using a combination between stereo vision and self-calibration techniques. Determining the rel- ative distance between visualized objects and a robot, with a stereo head, it is possible to navigate in unknown environments. Stereo vision techniques supply a depth measure by the combination of two or more images from the same scene. To achieve a depth estimates of the in scene objects a reconstruction of this scene geometry is necessary. For such reconstruction the relationship between the three-dimensional world coordi- nates and the two-dimensional images coordinates is necessary. Through the achievement of the cameras intrinsic parameters it is possible to make this coordinates systems relationship. These parameters can be gotten through geometric camera calibration, which, generally is made by a correlation between image characteristics of a calibration pattern with know dimensions. The cameras self-calibration allows the achievement of their intrinsic parameters without using a known calibration pattern, being possible their calculation and alteration during the displacement of the robot in an unknown environment. In this work a self-calibration method based in the three-dimensional polar coordinates to represent image features is presented. This representation is determined by the relationship between images features and horizontal and vertical opening cameras angles. Using the polar coordinates it is possible to geometrically reconstruct the scene. Through the proposed techniques combination it is possible to calculate a scene objects depth estimate, allowing the robot navigation in an unknown environment

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The camera motion estimation represents one of the fundamental problems in Computer Vision and it may be solved by several methods. Preemptive RANSAC is one of them, which in spite of its robustness and speed possesses a lack of flexibility related to the requirements of applications and hardware platforms using it. In this work, we propose an improvement to the structure of Preemptive RANSAC in order to overcome such limitations and make it feasible to execute on devices with heterogeneous resources (specially low budget systems) under tighter time and accuracy constraints. We derived a function called BRUMA from Preemptive RANSAC, which is able to generalize several preemption schemes, allowing previously fixed parameters (block size and elimination factor) to be changed according the applications constraints. We also propose the Generalized Preemptive RANSAC method, which allows to determine the maximum number of hipotheses an algorithm may generate. The experiments performed show the superiority of our method in the expected scenarios. Moreover, additional experiments show that the multimethod hypotheses generation achieved more robust results related to the variability in the set of evaluated motion directions

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Methods based on visual estimation still is the most widely used analysis of the distances that is covered by soccer players during matches, and most description available in the literature were obtained using such an approach. Recently, systems based on computer vision techniques have appeared and the very first results are available for comparisons. The aim of the present study was to analyse the distances covered by Brazilian soccer players and compare the results to the European players', both data measured by automatic tracking system. Four regular Brazilian First Division Championship matches between different teams were filmed. Applying a previously developed automatic tracking system (DVideo, Campinas, Brazil), the results of 55 outline players participated in the whole game (n = 55) are presented. The results of mean distances covered, standard deviations (s) and coefficient of variation (cv) after 90 minutes were 10,012 m, s = 1,024 m and cv = 10.2%, respectively. The results of three-way ANOVA according to playing positions, showed that the distances covered by external defender (10642 ± 663 m), central midfielders (10476 ± 702 m) and external midfielders (10598 ± 890 m) were greater than forwards (9612 ± 772 m) and forwards covered greater distances than central defenders (9029 ± 860 m). The greater distances were covered in standing, walking, or jogging, 5537 ± 263 m, followed by moderate-speed running, 1731 ± 399 m; low speed running, 1615 ± 351 m; high-speed running, 691 ± 190 m and sprinting, 437 ± 171 m. Mean distance covered in the first half was 5,173 m (s = 394 m, cv = 7.6%) highly significant greater (p < 0.001) than the mean value 4,808 m (s = 375 m, cv = 7.8%) in the second half. A minute-by-minute analysis revealed that after eight minutes of the second half, player performance has already decreased and this reduction is maintained throughout the second half. ©Journal of Sports Science and Medicine (2007).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Pós-graduação em Ciência da Computação - IBILCE

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Stereoscopic depth perception utilizes the disparity cues between the images that fall on the retinae of the two eyes. The purpose of this study was to determine what role aging and optical blur play in stereoscopic disparity sensitivity for real depth stimuli. Forty-six volunteers were tested ranging in age from 15 to 60 years. Crossed and uncrossed disparity thresholds were measured using white light under conditions of best optical correction. The uncrossed disparity thresholds were also measured with optical blur (from +1.0D to +5.0D added to the best correction). Stereothresholds were measured using the Frisby Stereo Test, which utilizes a four-alternative forced-choice staircase procedure. The threshold disparities measured for young adults were frequently lower than 10 arcsec, a value considerably lower than the clinical estimates commonly obtained using Random Dot Stereograms (20 arcsec) or Titmus Fly Test (40 arcsec) tests. Contrary to previous reports, disparity thresholds increased between the ages of 31 and 45 years. This finding should be taken into account in clinical evaluation of visual function of older patients. Optical blur degrades visual acuity and stereoacuity similarly under white-light conditions, indicating that both functions are affected proportionally by optical defocus.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Inspection for corrosion of gas storage spheres at the welding seam lines must be done periodically. Until now this inspection is being done manually and has a high cost associated to it and a high risk of inspection personel injuries. The Brazilian Petroleum Company, Petrobras, is seeking cost reduction and personel safety by the use of autonomous robot technology. This paper presents the development of a robot capable of autonomously follow a welding line and transporting corrosion measurement sensors. The robot uses a pair of sensors each composed of a laser source and a video camera that allows the estimation of the center of the welding line. The mechanical robot uses four magnetic wheels to adhere to the sphere's surface and was constructed in a way that always three wheels are in contact with the sphere's metallic surface which guarantees enough magnetic atraction to hold the robot in the sphere's surface all the time. Additionally, an independently actuated table for attaching the corrosion inspection sensors was included for small position corrections. Tests were conducted at the laboratory and in a real sphere showing the validity of the proposed approach and implementation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]In this paper, we experimentally study the combination of face and facial feature detectors to improve face detection performance. The face detection problem, as suggeted by recent face detection challenges, is still not solved. Face detectors traditionally fail in large-scale problems and/or when the face is occluded or di erent head rotations are present. The combination of face and facial feature detectors is evaluated with a public database. The obtained results evidence an improvement in the positive detection rate while reducing the false detection rate. Additionally, we prove that the integration of facial feature detectors provides useful information for pose estimation and face alignment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The quality of fish products is indispensably linked to the freshness of the raw material modulated by appropriate manipulation and storage conditions, specially the storage temperature after catch. The purpose of the research presented in this thesis, which was largely conducted in the context of a research project funded by Italian Ministry of Agricultural, Food and Forestry Policies (MIPAAF), concerned the evaluation of the freshness of farmed and wild fish species, in relation to different storage conditions, under ice (0°C) or at refrigeration temperature (4°C). Several specimens of different species, bogue (Boops boops), red mullet (Mullus barbatus), sea bream (Sparus aurata) and sea bass (Dicentrarchus labrax), during storage, under the different temperature conditions adopted, have been examined. The assessed control parameters were physical (texture, through the use of a dynamometer; visual quality using a computer vision system (CVS)), chemical (through footprint metabolomics 1H-NMR) and sensory (Quality Index Method (QIM). Microbiological determinations were also carried out on the species of hake (Merluccius merluccius). In general obtained results confirmed that the temperature of manipulation/conservation is a key factor in maintaining fish freshness. NMR spectroscopy showed to be able to quantify and evaluate the kinetics for unselected compounds during fish degradation, even a posteriori. This can be suitable for the development of new parameters related to quality and freshness. The development of physical methods, particularly the image analysis performed by computer vision system (CVS), for the evaluation of fish degradation, is very promising. Among CVS parameters, skin colour, presence and distribution of gill mucus, and eye shape modification evidenced a high sensibility for the estimation of fish quality loss, as a function of the adopted storage conditions. Particularly the eye concavity index detected on fish eye showed a high positive correlation with total QIM score.