117 resultados para Android Computervision Computer Vision Sift HSV
Resumo:
We present a georeferenced photomosaic of the Lucky Strike hydrothermal vent field (Mid-Atlantic Ridge, 37°18’N). The photomosaic was generated from digital photographs acquired using the ARGO II seafloor imaging system during the 1996 LUSTRE cruise, which surveyed a ~1 km2 zone and provided a coverage of ~20% of the seafloor. The photomosaic has a pixel resolution of 15 mm and encloses the areas with known active hydrothermal venting. The final mosaic is generated after an optimization that includes the automatic detection of the same benthic features across different images (feature-matching), followed by a global alignment of images based on the vehicle navigation. We also provide software to construct mosaics from large sets of images for which georeferencing information exists (location, attitude, and altitude per image), to visualize them, and to extract data. Georeferencing information can be provided by the raw navigation data (collected during the survey) or result from the optimization obtained from imatge matching. Mosaics based solely on navigation can be readily generated by any user but the optimization and global alignment of the mosaic requires a case-by-case approach for which no universally software is available. The Lucky Strike photomosaics (optimized and navigated-only) are publicly available through the Marine Geoscience Data System (MGDS, http://www.marine-geo.org). The mosaic-generating and viewing software is available through the Computer Vision and Robotics Group Web page at the University of Girona (http://eia.udg.es/_rafa/mosaicviewer.html)
Resumo:
The relief of the seafloor is an important source of data for many scientists. In this paper we present an optical system to deal with underwater 3D reconstruction. This system is formed by three cameras that take images synchronously in a constant frame rate scheme. We use the images taken by these cameras to compute dense 3D reconstructions. We use Bundle Adjustment to estimate the motion ofthe trinocular rig. Given the path followed by the system, we get a dense map of the observed scene by registering the different dense local reconstructions in a unique and bigger one
Resumo:
In this paper, we present a method to deal with the constraints of the underwater medium for finding changes between sequences of underwater images. One of the main problems of underwater medium for automatically detecting changes is the low altitude of the camera when taking pictures. This emphasise the parallax effect between the images as they are not taken exactly at the same position. In order to solve this problem, we are geometrically registering the images together taking into account the relief of the scene
Resumo:
Evaluating other individuals with respect to personality characteristics plays a crucial role in human relations and it is the focus of attention for research in diverse fields such as psychology and interactive computer systems. In psychology, face perception has been recognized as a key component of this evaluation system. Multiple studies suggest that observers use face information to infer personality characteristics. Interactive computer systems are trying to take advantage of these findings and apply them to increase the natural aspect of interaction and to improve the performance of interactive computer systems. Here, we experimentally test whether the automatic prediction of facial trait judgments (e.g. dominance) can be made by using the full appearance information of the face and whether a reduced representation of its structure is sufficient. We evaluate two separate approaches: a holistic representation model using the facial appearance information and a structural model constructed from the relations among facial salient points. State of the art machine learning methods are applied to a) derive a facial trait judgment model from training data and b) predict a facial trait value for any face. Furthermore, we address the issue of whether there are specific structural relations among facial points that predict perception of facial traits. Experimental results over a set of labeled data (9 different trait evaluations) and classification rules (4 rules) suggest that a) prediction of perception of facial traits is learnable by both holistic and structural approaches; b) the most reliable prediction of facial trait judgments is obtained by certain type of holistic descriptions of the face appearance; and c) for some traits such as attractiveness and extroversion, there are relationships between specific structural features and social perceptions.
Resumo:
This research extends a previously developed work concerning about the use of local model predictive control in mobile robots. Hence, experimental results are presented as a way to improve the methodology by considering aspects as trajectory accuracy and time performance. In this sense, the cost function and the prediction horizon are important aspects to be considered. The platformused is a differential driven robot with a free rotating wheel. The aim of the present work is to test the control method by measuring trajectory tracking accuracy and time performance. Moreover, strategies for the integration with perception system and path planning are also introduced. In this sense, monocular image data provide an occupancy grid where safety trajectories are computed by using goal attraction potential fields
Resumo:
Treball final de carrera basat en el reconeixement de punts clau en imatges mitjançant l'algorisme Random Ferns.
Resumo:
This paper proposes an automatic hand detection system that combines the Fourier-Mellin Transform along with other computer vision techniques to achieve hand detection in cluttered scene color images. The proposed system uses the Fourier-Mellin Transform as an invariant feature extractor to perform RST invariant hand detection. In a first stage of the system a simple non-adaptive skin color-based image segmentation and an interest point detector based on corners are used in order to identify regions of interest that contains possible matches. A sliding window algorithm is then used to scan the image at different scales performing the FMT calculations only in the previously detected regions of interest and comparing the extracted FM descriptor of the windows with a hand descriptors database obtained from a train image set. The results of the performed experiments suggest the use of Fourier-Mellin invariant features as a promising approach for automatic hand detection.
Resumo:
La segmentació de persones es molt difícil a causa de la variabilitat de les diferents condicions, com la postura que aquestes adoptin, color del fons, etc. Per realitzar aquesta segmentació existeixen diferents tècniques, que a partir d'una imatge ens retornen un etiquetat indicant els diferents objectes presents a la imatge. El propòsit d'aquest projecte és realitzar una comparativa de les tècniques recents que permeten fer segmentació multietiqueta i que son semiautomàtiques, en termes de segmentació de persones. A partir d'un etiquetatge inicial idèntic per a tots els mètodes utilitzats, s'ha realitzat una anàlisi d'aquests, avaluant els seus resultats sobre unes dades publiques, analitzant 2 punts: el nivell de interacció i l'eficiència.
Resumo:
This paper proposes an automatic hand detection system that combines the Fourier-Mellin Transform along with other computer vision techniques to achieve hand detection in cluttered scene color images. The proposed system uses the Fourier-Mellin Transform as an invariant feature extractor to perform RST invariant hand detection. In a first stage of the system a simple non-adaptive skin color-based image segmentation and an interest point detector based on corners are used in order to identify regions of interest that contains possible matches. A sliding window algorithm is then used to scan the image at different scales performing the FMT calculations only in the previously detected regions of interest and comparing the extracted FM descriptor of the windows with a hand descriptors database obtained from a train image set. The results of the performed experiments suggest the use of Fourier-Mellin invariant features as a promising approach for automatic hand detection.
Resumo:
El principal objectiu d’aquest projecte és aconseguir classificar diferents vídeos d’esports segons la seva categoria. Els cercadors de text creen un vocabulari segons el significat de les diferents paraules per tal de poder identificar un document. En aquest projecte es va fer el mateix però mitjançant paraules visuals. Per exemple, es van intentar englobar com a una única paraula les diferents rodes que apareixien en els cotxes de rally. A partir de la freqüència amb què apareixien les paraules dels diferents grups dins d’una imatge vàrem crear histogrames de vocabulari que ens permetien tenir una descripció de la imatge. Per classificar un vídeo es van utilitzar els histogrames que descrivien els seus fotogrames. Com que cada histograma es podia considerar un vector de valors enters vàrem optar per utilitzar una màquina classificadora de vectors: una Support vector machine o SVM
Resumo:
In robotics, having a 3D representation of the environment where a robot is working can be very useful. In real-life scenarios, this environment is constantly changing for example by human interaction, external agents or by the robot itself. Thus, the representation needs to be constantly updated and extended to account for these dynamic scene changes. In this work we face the problem of representing the scene where a robot is acting. Moreover, we ought to improve this representation by reusing the information obtained in previous scenes. Our goal is to build a method to represent a scene and to update it while changes are produced. In order to achieve that, different aspects of computer vision such as space representation or feature tracking are discussed
Resumo:
Psychophysical studies suggest that humans preferentially use a narrow band of low spatial frequencies for face recognition. Here we asked whether artificial face recognition systems have an improved recognition performance at the same spatial frequencies as humans. To this end, we estimated recognition performance over a large database of face images by computing three discriminability measures: Fisher Linear Discriminant Analysis, Non-Parametric Discriminant Analysis, and Mutual Information. In order to address frequency dependence, discriminabilities were measured as a function of (filtered) image size. All three measures revealed a maximum at the same image sizes, where the spatial frequency content corresponds to the psychophysical found frequencies. Our results therefore support the notion that the critical band of spatial frequencies for face recognition in humans and machines follows from inherent properties of face images, and that the use of these frequencies is associated with optimal face recognition performance.
Resumo:
Aquest projecte s’emmarca dins de l’àmbit de la visió per computador, concretament en la utilització de dades de profunditat obtingudes a través d’un emissor i sensor de llum infraroja.El propòsit principal d’aquest projecte és mostrar com adaptar aquestes tecnologies, a l’abast de qualsevol particular, de forma que un usuari durant la pràctica d’una activitat esportiva concreta, rebi informació visual continua dels moviments i gestos incorrectes que està realitzant, en base a uns paràmetres prèviament establerts.L’objectiu d’aquest projecte consisteix en fer una lectura constant en temps real d’una persona practicant una selecció de diverses activitats esportives estàtiques utilitzant un sensor Kinect. A través de les dades obtingudes pel sensor Kinect i utilitzant les llibreries de “skeleton traking” proporcionades per Microsoft s’haurà d’interpretar les dades posturals obtingudes per cada tipus d’esport i indicar visualment i d’una manera intuïtiva els errors que està cometent en temps real, de manera que es vegi clarament a quina part del seu cos realitza un moviment incorrecte per tal de poder corregir-lo ràpidament. El entorn de desenvolupament que s’utilitza per desenvolupar aquesta aplicació es Microsoft Viusal Studio 2010.El llenguatge amb el qual es treballarà sobre Microsoft Visual Studio 2010 és C#
Resumo:
Aquest projecte s'ha desenvolupat dins de l'àrea de visió per computadors, mitjançant el reconeixement d'un patró podem definir tres eixos que conformen un espai tridimensional on hem implementat un videojoc de combats entre robots a sobre d'un entorn real.
Resumo:
L'objectiu principal d'aquest treball és aplicar tècniques de visió articial per aconseguir localitzar i fer el seguiment de les extremitats dels ratolins dins l'entorn de prova de les investigacions d'optogenètica del grup de recerca del Neuroscience Institute de la Universitat de Princeton, Nova Jersey.