985 results for Object vision


Relevance: 30.00%

Abstract:

Vision-based tracking of an object using the ideas of perspective projection inherently involves nonlinearly modelled measurements, although the underlying dynamic system encompassing the object and the vision sensors can be linear. Based on a necessary stereo vision setting, we introduce an appropriate measurement conversion technique which subsequently facilitates the use of a linear filter. The linear filter, together with the aforementioned measurement conversion approach, forms a robust linear filter based on set-valued state estimation ideas, a particularly rich area in the robust control literature. We provide a rigorous theoretical analysis to ensure bounded state estimation errors, formulated in terms of an ellipsoidal set in which the actual state is guaranteed to be included with arbitrarily high probability. Using computer simulations as well as a practical implementation consisting of a robotic manipulator, we demonstrate that our robust linear filter significantly outperforms the traditionally used extended Kalman filter in this stereo vision scenario. © 2008 IEEE.
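As a rough illustration of the measurement-conversion idea (a sketch under an assumed rectified-stereo geometry, not the authors' derivation), the following triangulates a matched stereo pixel pair into a Cartesian position, after which a linear filter can operate on the converted measurements; the focal length, baseline, and function name are hypothetical.

```python
import numpy as np

# Assumed rectified stereo rig: focal length in pixels, baseline in metres.
F_PX, BASELINE_M = 700.0, 0.12

def stereo_to_cartesian(u_left, u_right, v):
    """Triangulate a matched pixel pair (u_left, v) / (u_right, v) into a
    Cartesian position, so that a linear filter can run on converted
    measurements instead of the nonlinear pixel model."""
    disparity = u_left - u_right              # horizontal pixel disparity
    z = F_PX * BASELINE_M / disparity         # depth from similar triangles
    x = z * u_left / F_PX                     # lateral offset in metres
    y = z * v / F_PX                          # vertical offset in metres
    return np.array([x, y, z])
```

The conversion is geometrically exact but inflates and correlates the measurement noise, which is one reason a robust, set-valued formulation with bounded error sets is attractive here.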

Relevance: 30.00%

Abstract:

Many vision problems, such as motion segmentation and face clustering, deal with high-dimensional data. However, these high-dimensional data usually lie in a low-dimensional structure. Sparse representation is a powerful principle for solving a number of clustering problems with high-dimensional data. This principle is motivated by an ideal modelling of data points according to linear algebra theory. However, real data in computer vision are unlikely to follow the ideal model perfectly. In this paper, we exploit mixed-norm regularization for sparse subspace clustering. This regularization term is a convex combination of the ℓ1 norm, which promotes sparsity at the individual level, and the block norm ℓ2/1, which promotes group sparsity. Combining these powerful regularization terms provides a more accurate model and, subsequently, a better solution for the affinity matrix used in sparse subspace clustering, helping us achieve better performance on motion segmentation and face clustering problems. The formulation also caters for different types of data corruption. We derive a provably convergent and computationally efficient algorithm, based on the alternating direction method of multipliers (ADMM) framework, to solve the formulation. We demonstrate that this formulation outperforms other state-of-the-art methods on both motion segmentation and face clustering.
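In ADMM, the subproblem attached to the regularizer reduces to a proximal step on the mixed norm. A minimal numpy sketch of that step follows, assuming the regularizer λ(α‖C‖1 + (1−α)‖C‖2/1) acts columnwise; this composition (elementwise then groupwise shrinkage) is the known prox of the sparse-group penalty, though the paper's exact parameterization may differ.

```python
import numpy as np

def soft_threshold(X, t):
    """Elementwise shrinkage: the prox of t * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def group_shrink(X, t):
    """Columnwise shrinkage: the prox of t * sum_j ||X[:, j]||_2."""
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    return X * np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)

def prox_mixed_norm(X, lam, alpha):
    """Prox of lam * (alpha * ||X||_1 + (1 - alpha) * ||X||_{2,1})."""
    return group_shrink(soft_threshold(X, lam * alpha), lam * (1.0 - alpha))
```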

Relevance: 30.00%

Abstract:

The problem of dynamic camera calibration considering moving objects in close-range environments, using straight lines as references, is addressed. A mathematical model for the correspondence of a straight line in the object and image spaces is discussed. This model is based on the equivalence between the vector normal to the interpretation plane in the image space and the vector normal to the rotated interpretation plane in the object space. In order to solve the dynamic camera calibration, Kalman filtering is applied; an iterative process based on the recursive property of the Kalman filter is defined, using the sequentially estimated camera orientation parameters to feed back the feature extraction process in the image. For the dynamic case, e.g. an image sequence of a moving object, a state prediction and a covariance matrix for the next instant are obtained using the available estimates and the system model. Filtered state estimates of good quality can then be computed from these predicted estimates, using the Kalman filtering approach and the system model parameters, for each instant of the image sequence. The proposed approach was tested with simulated and real data. Experiments with real data were carried out in a controlled environment, considering a sequence of images of a moving cube on a linear trajectory over a flat surface.
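For reference, the predict/update recursion underlying the iterative process reads as in the sketch below; the model matrices F, H, Q, and R are generic placeholders, not the paper's line-based measurement model.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """State and covariance prediction for the next instant."""
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    """Fuse a new measurement z into the predicted estimate."""
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    x_new = x + K @ (z - H @ x)                # filtered state
    P_new = (np.eye(len(x)) - K @ H) @ P       # filtered covariance
    return x_new, P_new
```

The predicted state is what feeds back into feature extraction: it suggests where the reference lines should appear in the next image.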

Relevance: 30.00%

Abstract:

The human face provides useful information during interaction; therefore, any system integrating vision-based human-computer interaction requires fast and reliable face and facial feature detection. Different approaches have addressed this ability, but only open-source implementations have been extensively used by researchers. A good example is the Viola–Jones object detection framework, which has been used frequently, particularly in the context of facial processing.
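As a concrete example of such an open-source implementation, OpenCV ships a Viola–Jones cascade classifier; a minimal sketch follows (the input file name is a placeholder).

```python
import cv2

# Load the stock frontal-face Haar cascade bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("frame.jpg")                              # placeholder frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:                                 # draw detections
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```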

Relevance: 30.00%

Abstract:

Introduction: Although it seems plausible that sports performance relies on high-acuity foveal vision, it has been shown empirically that myopic blur (up to +2 diopters) does not harm performance in sport tasks that require foveal information pick-up, like golf putting (Bulson, Ciuffreda, & Hung, 2008). How myopic blur affects peripheral performance is as yet unknown. With reduced foveal vision, less attention may be needed for processing visual cues foveally, and peripheral cues may be processed better, leading to better performance; this prediction was tested in the current experiment.

Methods: 18 sport science students with self-reported myopia volunteered as participants, all of them regularly wearing contact lenses. Exclusion criteria comprised visual correction other than myopic, correction of astigmatism, and use of contact lenses from outside the Swiss delivery area. For each participant, three pairs of additional contact lenses (besides their regular lenses, used in the "plano" condition) were manufactured with an individual overcorrection to a retinal defocus of +1 to +3 diopters (referred to as the "+1.00 D", "+2.00 D", and "+3.00 D" conditions, respectively). Gaze data were acquired while participants performed a multiple object tracking (MOT) task that required tracking 4 out of 10 moving stimuli. In addition, in 66.7 % of all trials, one of the 4 targets suddenly stopped during the motion phase for a period of 0.5 s. Stimuli moved in front of a picture of a sports hall to allow for foveal processing. Due to the directional hypotheses, the level of significance for one-tailed tests on differences was set at α = .05, and a posteriori effect sizes were computed as partial eta squared (ηp²).

Results: Due to problems with the gaze-data collection, 3 participants had to be excluded from further analyses. The expectation of a centroid strategy was confirmed, because gaze was closer to the centroid than to the targets (all p < .01). In comparison to the plano baseline, participants more often recalled all 4 targets under defocus conditions, F(1,14) = 26.13, p < .01, ηp² = .65. The three defocus conditions differed significantly, F(2,28) = 2.56, p = .05, ηp² = .16, with higher accuracy as a function of increased defocus and significant contrasts between conditions +1.00 D and +2.00 D (p = .03) and between +1.00 D and +3.00 D (p = .03). For stop trials, significant differences were found neither between the plano baseline and the defocus conditions, F(1,14) = .19, p = .67, ηp² = .01, nor between the three defocus conditions, F(2,28) = 1.09, p = .18, ηp² = .07. Participants reacted faster in "4 correct+button" trials under defocus than under plano-baseline conditions, F(1,14) = 10.77, p < .01, ηp² = .44. The defocus conditions differed significantly, F(2,28) = 6.16, p < .01, ηp² = .31, with shorter response times as a function of increased defocus and significant contrasts between +1.00 D and +2.00 D (p = .01) and between +1.00 D and +3.00 D (p < .01).

Discussion: The results show that gaze behaviour in MOT is not affected to a relevant degree by a visual overcorrection of up to +3 diopters. Hence, it can be taken for granted that peripheral event detection was investigated in the present study. This overcorrection, however, does not harm the capability to track objects peripherally. Moreover, if an event has to be detected peripherally, neither response accuracy nor response time is negatively affected. The findings may be of considerable relevance for all sport situations in which peripheral vision is required, which now calls for applied studies on this topic.

References: Bulson, R. C., Ciuffreda, K. J., & Hung, G. K. (2008). The effect of retinal defocus on golf putting. Ophthalmic and Physiological Optics, 28, 334-344.

Relevance: 30.00%

Abstract:

One of the most challenging problems that must be solved by any theoretical model purporting to explain the competence of the human brain for relational tasks is the analysis and representation of the internal structure of an extended spatial layout of multiple objects. Some of these problems concern specific questions, such as how spatial relationships among objects can be extracted and represented, how the movement of a selected object can be represented, and so on. The main objective of this paper is the study of some plausible brain structures that can provide answers to these problems. Moreover, in order to achieve more concrete knowledge, our study focuses on the response of the retinal layers in optical information processing and on how this information can be processed in the first cortical layers. The model reported here is just a first trial, and some major additions are needed to complete the whole vision process.

Relevance: 30.00%

Abstract:

In this paper, we apply a hierarchical strategy for tracking planar objects (or objects that can be assumed to be planar), based on direct methods, for vision-based applications on board UAVs. This tracking strategy allows the tasks to be achieved at real-time frame rates and overcomes the problems posed by their challenging conditions: e.g. constant vibrations, fast 3D changes, or limited capacity on board. The vast majority of approaches use feature-based methods to track objects. Nonetheless, in this paper we show that although some of these feature-based solutions are faster, direct methods can be more robust under fast 3D motions (fast changes in position), some changes in appearance, constant vibrations (without requiring any specific hardware or software for video stabilization), and situations in which part of the object to track is outside the camera's field of view. The performance of the proposed tracking strategy on board UAVs is evaluated with images from real flight tests, using manually generated ground-truth information, accurate position estimation from a Vicon system, and simulated data from a simulation environment. Results show that the hierarchical tracking strategy performs better than well-known feature-based algorithms and well-known configurations of direct methods, and that its performance is robust enough for vision-in-the-loop tasks, e.g. vision-based landing.
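Direct methods align image intensities rather than matched features. As an illustration of the idea (not the paper's hierarchical implementation), OpenCV's ECC alignment can register a planar template to the current frame under a homography; the file names are placeholders.

```python
import cv2
import numpy as np

template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

warp = np.eye(3, dtype=np.float32)           # initial homography guess
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-6)
# Maximize the ECC photometric criterion directly over warp parameters.
cc, warp = cv2.findTransformECC(template, frame, warp,
                                cv2.MOTION_HOMOGRAPHY, criteria)
aligned = cv2.warpPerspective(frame, warp,
                              (template.shape[1], template.shape[0]),
                              flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
```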

Relevance: 30.00%

Abstract:

Electronic devices endowed with camera platforms require new and powerful machine vision applications, which commonly include moving object detection strategies. To obtain high-quality results, the most recent strategies estimate background and foreground models nonparametrically and combine them by means of a Bayesian classifier. However, typical classifiers are limited by the use of constant prior values, and they do not allow the inclusion of additional spatially dependent prior information. In this Letter, we propose an alternative Bayesian classifier that, unlike those reported before, allows the use of additional prior information obtained from any source and depending on the spatial position of each pixel.
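A minimal sketch of such a classifier, assuming per-pixel nonparametric likelihood maps are already available (the names and the 0.5 decision threshold are illustrative):

```python
import numpy as np

def classify_foreground(p_fg, p_bg, prior_fg):
    """Per-pixel Bayesian combination of foreground/background likelihoods.
    prior_fg is a full map rather than a constant, so spatially dependent
    prior information from any source can be injected."""
    posterior = p_fg * prior_fg / (
        p_fg * prior_fg + p_bg * (1.0 - prior_fg) + 1e-12)
    return posterior > 0.5          # boolean foreground mask
```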

Relevance: 30.00%

Abstract:

A novel GPU-based nonparametric moving object detection strategy for computer vision tools requiring real-time processing is proposed. An alternative and efficient Bayesian classifier to combine nonparametric background and foreground models increases correct detections while avoiding false detections. Additionally, an efficient region-of-interest analysis significantly reduces the computational cost of the detections.
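One plausible reading of the region-of-interest analysis (a CPU sketch of the idea, not the letter's GPU scheme) is to restrict the expensive per-pixel models to areas around the previous frame's detections:

```python
import cv2
import numpy as np

def rois_from_previous_mask(prev_mask, margin=16):
    """Dilate the previous detection mask by a safety margin and return
    bounding boxes; only these boxes need full model evaluation."""
    dilated = cv2.dilate(prev_mask.astype(np.uint8) * 255,
                         np.ones((margin, margin), np.uint8))
    contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours]   # (x, y, w, h) boxes
```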

Relevance: 30.00%

Abstract:

The main objective of this dissertation is to provide a real-time, accurate, and robust monocular or stereo vision-based solution for an Unmanned Aerial Vehicle (UAV) to achieve autonomy in various types of UAV applications, especially in GPS-denied dynamic cluttered environments.

This dissertation mainly consists of three UAV research topics based on computer vision techniques. (I) Visual tracking: it supplies effective solutions to visually locate static or moving objects of interest over time during UAV flight, with an online adaptive approach and a multiple-resolution strategy, thereby overcoming the problems generated by different challenging situations, such as significant appearance change, varying surrounding illumination, cluttered tracking background, partial or full object occlusion, rapid pose variation, and onboard mechanical vibration. The solutions have been utilized in autonomous landing, offshore floating platform inspection, and midair aircraft tracking for sense-and-avoid. (II) Visual odometry: it provides an efficient solution for the UAV to estimate the 6-degree-of-freedom (6D) pose using only the input of an onboard stereo camera. An efficient Semi-Global Block Matching (SGBM) method based on a coarse-to-fine strategy has been implemented for fast depth map estimation. In addition, the solution effectively takes advantage of both 2D and 3D information to estimate the 6D pose, thereby overcoming the limitation of the fixed small baseline of the stereo camera. A robust volumetric occupancy mapping approach based on the Octomap framework has been utilized to reconstruct large-scale cluttered indoor and outdoor environments in 3D with reduced memory usage and fewer temporally or spatially correlated measurement errors. (III) Visual control: it offers practical control solutions for navigating the UAV using a Fuzzy Logic Controller (FLC) with the visual estimation, and the Cross-Entropy Optimization (CEO) framework has been used to optimize the scaling factors and the membership functions in the FLC. All the vision-based solutions in this dissertation have been validated in real tests, and the real image datasets recorded in these tests, or available from the public community, have been utilized to evaluate the performance of these vision-based solutions against ground truth. Additionally, the presented vision solutions have been compared with state-of-the-art visual algorithms. Real tests and evaluation results show that the provided vision-based solutions achieve accurate and robust real-time performance, or outperform those state-of-the-art visual algorithms. Vision-based estimation has played a critically important role in controlling a typical UAV to achieve autonomy in UAV applications.
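For the depth-estimation stage, OpenCV's semi-global block matcher gives a feel for the computation (parameters are illustrative defaults, not the dissertation's tuned values; file names are placeholders):

```python
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(minDisparity=0,
                                numDisparities=128,   # must be divisible by 16
                                blockSize=5)
# OpenCV returns fixed-point disparities scaled by 16.
disparity = matcher.compute(left, right).astype("float32") / 16.0
# depth_m = focal_length_px * baseline_m / disparity, where disparity > 0
```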

Relevance: 30.00%

Abstract:

Efficient and reliable classification of visual stimuli requires that their representations reside in a low-dimensional and, therefore, computationally manageable feature space. We investigated the ability of the human visual system to derive such representations from the sensory input, a highly nontrivial task given the million or so dimensions of the visual signal at its entry point to the cortex. In a series of experiments, subjects were presented with sets of parametrically defined shapes; the points in the common high-dimensional parameter space corresponding to the individual shapes formed regular planar (two-dimensional) patterns such as a triangle, a square, etc. We then used multidimensional scaling to arrange the shapes in planar configurations dictated by their experimentally determined perceived similarities. The resulting configurations closely resembled the original arrangements of the stimuli in the parameter space. This achievement of the human visual system was replicated by a computational model derived from a theory of object representation in the brain, according to which similarities between objects, and not the geometry of each object, need to be faithfully represented.
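A small sketch of the analysis step, using scikit-learn's MDS on a hypothetical perceived-dissimilarity matrix (the data here are random placeholders, not the study's ratings):

```python
import numpy as np
from sklearn.manifold import MDS

n = 6                                           # number of shapes
rng = np.random.default_rng(0)
d = rng.random((n, n))                          # placeholder dissimilarities
d = (d + d.T) / 2.0                             # symmetrize
np.fill_diagonal(d, 0.0)                        # zero self-dissimilarity

# Recover a planar configuration whose inter-point distances best match
# the perceived dissimilarities.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(d)                   # one 2D point per shape
```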

Relevance: 30.00%

Abstract:

Sensing techniques are important for solving the problems of uncertainty inherent in intelligent grasping tasks. The main goal here is to present a visual sensing system, based on range imaging technology, for robot manipulation of non-rigid objects. Our proposal provides a suitable visual perception system for complex grasping tasks, supporting a robot controller when other sensor systems, such as tactile and force, are not able to obtain data relevant to the grasping manipulation task. In particular, a new visual approach based on RGBD data was implemented to help a robot controller carry out intelligent manipulation tasks with flexible objects. The proposed method supervises the interaction between the grasped object and the robot hand, in order to avoid poor contact between the fingertips and the object when neither force nor pressure data is available. This new approach is also used to measure changes in the shape of an object's surfaces, allowing us to find deformations caused by inappropriate pressure applied by the hand's fingers. Tests were carried out on grasping tasks involving several flexible household objects with a multi-fingered robot hand working in real time. Our approach generates pulses from the deformation detection method and sends an event message to the robot controller when surface deformation is detected. In comparison with other methods, the obtained results reveal that our visual pipeline does not require deformation models of objects and materials, and that the approach works well with both planar and 3D household objects in real time. In addition, our method does not depend on the pose of the robot hand, because the location of the reference system is computed from a recognition process of a pattern located on the robot forearm. The presented experiments demonstrate that the proposed method accomplishes good monitoring of grasping tasks with several objects and different grasping configurations in indoor environments.
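One plausible sketch of the deformation check (not the paper's exact pipeline): compare the object's current depth surface, inside a region of interest, with a reference surface captured at first contact, and emit an event when the displacement exceeds a threshold. The ROI format and threshold are illustrative.

```python
import numpy as np

def deformation_event(depth_now, depth_ref, roi, threshold_m=0.004):
    """Return True when the mean surface displacement inside the ROI
    exceeds the threshold, signalling inappropriate finger pressure."""
    x, y, w, h = roi
    patch_now = depth_now[y:y + h, x:x + w]
    patch_ref = depth_ref[y:y + h, x:x + w]
    valid = (patch_now > 0) & (patch_ref > 0)   # ignore depth holes
    displacement = np.abs(patch_now[valid] - patch_ref[valid]).mean()
    return displacement > threshold_m           # event message if True
```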

Relevance: 30.00%

Abstract:

This paper describes the real-time global vision system for the robot soccer team the RoboRoos. It has a highly optimised pipeline that includes thresholding, segmenting, colour normalising, object recognition, and perspective and lens correction. It has a fast 'paint' colour calibration system that can calibrate in any face of the YUV or HSI cube. It also autonomously selects both an appropriate camera gain and colour gains, from robot regions across the field, to achieve colour uniformity. Camera geometry calibration is performed automatically from a selection of keypoints on the field. The system achieves a position accuracy of better than 15 mm over a 4 m × 5.5 m field, and an orientation accuracy within 1°. It processes 614 × 480 pixels at 60 Hz on a 2.0 GHz Pentium 4 microprocessor.
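The usual trick behind this kind of real-time colour thresholding is a precomputed lookup table over the colour cube; a sketch under assumed class ids and box bounds (not the RoboRoos calibration itself):

```python
import numpy as np

# 16 MB lookup table mapping every YUV triple to a colour class (0 = unknown).
LUT = np.zeros((256, 256, 256), dtype=np.uint8)
LUT[16:60, 100:140, 160:240] = 1        # e.g. a painted "ball colour" box

def classify(yuv_frame):
    """Map every pixel to a colour class with a single table lookup."""
    y, u, v = yuv_frame[..., 0], yuv_frame[..., 1], yuv_frame[..., 2]
    return LUT[y, u, v]                 # per-pixel class ids
```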

Relevance: 30.00%

Abstract:

This thesis deals with the challenging problem of designing systems able to perceive objects in underwater environments. In the last few decades, research activities in robotics have advanced the state of the art regarding the intervention capabilities of autonomous systems. The state of the art in fields such as localization and navigation, real-time perception and cognition, and safe action and manipulation capabilities, applied to ground environments (both indoor and outdoor), has now reached such a readiness level that it allows high-level autonomous operations. On the opposite side, the underwater environment remains a very difficult one for autonomous robots. Water influences the mechanical and electrical design of systems, interferes with sensors by limiting their capabilities, heavily impacts data transmissions, and generally requires systems with low power consumption in order to enable reasonable mission durations. Interest in underwater applications is driven by the need to explore and intervene in environments in which human capabilities are very limited. Nowadays, most underwater field operations are carried out by manned or remotely operated vehicles, deployed for exploration and limited intervention missions. Manned vehicles, controlled directly on board, expose human operators to the risks of remaining in the field for the duration of the mission, within a hostile environment. Remotely Operated Vehicles (ROVs) currently represent the most advanced technology for underwater intervention services available on the market. These vehicles can be remotely operated for a long time, but they need support from an oceanographic vessel with multiple teams of highly specialized pilots. Vehicles equipped with multiple state-of-the-art sensors and capable of autonomously planning missions have been deployed in the last ten years and exploited as observers of underwater fauna, seabeds, shipwrecks, and so on. On the other hand, underwater operations such as object recovery and equipment maintenance are still challenging tasks to conduct without human supervision, since they require object perception and localization with much higher accuracy and robustness, to a degree seldom available in Autonomous Underwater Vehicles (AUVs). This thesis reports the study, from design to deployment and evaluation, of a general-purpose and configurable platform dedicated to stereo-vision perception in underwater environments. Several aspects related to the peculiar characteristics of the environment have been taken into account during all stages of system design and evaluation: depth of operation and light conditions, together with water turbidity and external weather, heavily impact perception capabilities. The vision platform proposed in this work is a modular system comprising off-the-shelf components for both the imaging sensors and the computational unit, linked by a high-performance Ethernet network bus. The adopted design philosophy aims at achieving high flexibility in terms of feasible perception applications, which should not be as limited as in the case of special-purpose, dedicated hardware. Flexibility is required by the variability of underwater environments, with water conditions ranging from clear to turbid, light backscattering varying with daylight and depth, strong color distortion, and other environmental factors. Furthermore, the proposed modular design ensures easier maintenance and updating of the system over time.

Performance of the proposed system, in terms of perception capabilities, has been evaluated in several underwater contexts, taking advantage of the opportunities offered by the MARIS national project. Design issues such as energy consumption, heat dissipation, and network capabilities have been evaluated in different scenarios. Finally, real-world experiments, conducted in multiple and variable underwater contexts, including open sea waters, have led to the collection of several datasets that have been publicly released to the scientific community. The vision system has been integrated into a state-of-the-art AUV equipped with a robotic arm and gripper, and has been exploited in the robot control loop to successfully perform underwater grasping operations.