841 results for visual object detection
Abstract:
A new method for the automated selection of colour features is described. The algorithm consists of two stages of processing. In the first, a complete set of colour features is calculated for every object of interest in an image. In the second stage, each object is mapped into several n-dimensional feature spaces in order to select the feature set with the fewest variables able to discriminate it from the remaining objects. The discrimination power of each candidate subset of features is evaluated by means of decision trees composed of linear discrimination functions. This method can provide valuable help in outdoor scene analysis, where no colour space has been demonstrated to be the most suitable. Experimental results on recognizing objects in outdoor scenes are reported.
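The second stage described above can be sketched as a search over feature subsets of increasing size. This is only an illustration under stated assumptions: the function name `smallest_discriminating_subset` and the `margin` parameter are hypothetical, and a simple Euclidean distance margin stands in for the decision trees of linear discrimination functions used in the paper.

```python
from itertools import combinations
from math import dist


def smallest_discriminating_subset(target, others, margin):
    """Return the smallest tuple of feature indices that separates
    `target` from every object in `others`, or None if none does.

    target: feature values of the object of interest.
    others: list of feature-value sequences for the remaining objects.
    margin: minimum Euclidean distance required in the chosen subspace
            (an assumed stand-in for the paper's linear discriminants).
    """
    n = len(target)
    # Try subsets in order of increasing size, so the first hit is minimal.
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            t = [target[i] for i in subset]
            if all(dist(t, [o[i] for i in subset]) >= margin for o in others):
                return subset
    return None


# Hypothetical colour features for three objects (illustrative values only).
target = (0.9, 0.2, 0.5)
others = [(0.85, 0.8, 0.5), (0.1, 0.25, 0.5)]
best = smallest_discriminating_subset(target, others, margin=0.3)
```

With these made-up values no single feature separates the target from both other objects, so the search moves on to pairs of features. The exhaustive search grows exponentially with the number of features, so it is only practical for small feature sets.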
Abstract:
Positioning a robot with respect to objects by using data provided by a camera is a well-known technique called visual servoing. In order to perform a task, the object must exhibit visual features which can be extracted from different points of view. Visual servoing is therefore object-dependent, as it depends on the object's appearance. Consequently, the positioning task cannot be performed in the presence of non-textured objects or objects for which extracting visual features is too complex or too costly. This paper proposes a solution to tackle this limitation inherent to current visual servoing techniques. Our proposal is based on the coded structured light approach as a reliable and fast way to solve the correspondence problem. In this case, a coded light pattern is projected, providing robust visual features independently of the object's appearance.
Abstract:
Positioning a robot with respect to objects by using data provided by a camera is a well-known technique called visual servoing. In order to perform a task, the object must exhibit visual features which can be extracted from different points of view. Visual servoing is therefore object-dependent, as it depends on the object's appearance. Consequently, the positioning task cannot be performed in the presence of non-textured objects or objects for which extracting visual features is too complex or too costly. This paper proposes a solution to tackle this limitation inherent to current visual servoing techniques. Our proposal is based on the coded structured light approach as a reliable and fast way to solve the correspondence problem. In this case, a coded light pattern is projected, providing robust visual features independently of the object's appearance.
Abstract:
This paper focuses on the problem of realizing a plane-to-plane virtual link between a camera attached to the end-effector of a robot and a planar object. To make the system independent of the object's surface appearance, a structured light emitter is linked to the camera so that four laser pointers are projected onto the object. In a previous paper we showed that such a system has good performance and desirable characteristics, such as partial decoupling near the desired state and robustness against misalignment of the emitter and the camera (J. Pages et al., 2004). However, no analytical results concerning the global asymptotic stability of the system were obtained, owing to the high complexity of the visual features used. In this work we present a better set of visual features which improves on the properties of the features in (J. Pages et al., 2004) and for which it is possible to prove global asymptotic stability.
Abstract:
In this paper we address the problem of positioning a camera attached to the end-effector of a robotic manipulator so that it becomes parallel to a planar object. This problem has long been studied in visual servoing. Our approach is based on attaching several laser pointers to the camera, configured so as to produce a suitable set of visual features. Structured light is used not only to ease the image processing and to allow low-textured objects to be handled, but also to produce a control scheme with desirable properties such as decoupling, stability, good conditioning and a good camera trajectory.
Abstract:
This thesis deals with the combination of visual servoing and structured light. Classical visual servoing assumes that visual features can be easily extracted from images. As a result, objects with a uniform or weakly textured appearance cannot be handled. In this thesis we propose using structured light to provide objects with visual features regardless of their appearance. First, a comprehensive study of structured light is presented, which allows us to propose a new coded pattern that improves on existing ones. The rest of the thesis concentrates on positioning a robot equipped with a camera with respect to different objects, using the information provided by the projection of different light patterns. Two configurations have been studied: one in which the light projector is separate from the robot, and one in which the projector is mounted on the robot together with the camera. The techniques proposed in the thesis are supported by an extensive analytical study and validated by experimental results.
Abstract:
The human visual ability to perceive depth remains something of a puzzle. We perceive three-dimensional spatial information quickly and efficiently by using the binocular stereopsis of our eyes and, more importantly, our knowledge of the most common objects, acquired through experience. Modelling the behaviour of the brain is currently out of reach, which is why the huge problem of 3D perception and subsequent interpretation is split into a sequence of easier problems. A great deal of research in robot vision aims to obtain 3D information about the surrounding scene. Most of this research is based on modelling human stereopsis by using two cameras as if they were two eyes. This method is known as stereo vision; it has been widely studied in the past, is being studied at present, and will surely attract much more work in the future, which makes it one of the most interesting topics in computer vision. The stereo vision principle is based on obtaining the three-dimensional position of an object point from the positions of its projections in both camera image planes. However, before 3D information can be inferred, the mathematical models of both cameras have to be known. This step is known as camera calibration and is described at length in the thesis. Perhaps the most important problem in stereo vision is determining the pairs of homologous points in the two images, known as the correspondence problem; it is also one of the most difficult problems to solve and is currently being investigated by many researchers. Epipolar geometry allows us to reduce the correspondence problem, and an approach to epipolar geometry is described in the thesis. Nevertheless, it does not solve the problem completely, as many considerations have to be taken into account. For example, we have to consider points without correspondence due to a surface occlusion or simply because they project outside the camera's field of view.
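The stereo vision principle mentioned above can be illustrated for the simplest case. The sketch below assumes an ideal rectified pair (identical calibrated cameras, parallel optical axes, image coordinates measured in pixels from the principal point); the thesis treats general camera models, so this is a simplification, and the function and parameter names are hypothetical.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline):
    """Recover the 3D position of a point from a rectified stereo pair.

    x_left, x_right: horizontal image coordinates of the two homologous
                     points, in pixels relative to the principal point.
    y:               shared vertical image coordinate, in pixels.
    focal_px:        focal length in pixels (assumed equal for both cameras).
    baseline:        distance between the two optical centres.
    Returns (X, Y, Z) in the left camera frame, in the units of `baseline`.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline / disparity  # depth is inversely proportional to disparity
    return (x_left * z / focal_px, y * z / focal_px, z)


# Illustrative values only: 800 px focal length, 0.5 m baseline.
point = triangulate_rectified(x_left=40.0, x_right=20.0, y=10.0,
                              focal_px=800.0, baseline=0.5)
```

The sketch presupposes that the correspondence between `x_left` and `x_right` is already known, which is precisely the hard part discussed in the abstract.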
The thesis focuses on structured light, which is considered one of the most frequently used techniques for reducing the problems related to stereo vision. Structured light is based on the relationship between a projected light pattern, its projection and an image sensor. The deformations between the pattern projected onto the scene and the one captured by the camera make it possible to obtain three-dimensional information about the illuminated scene. This technique has been widely used in applications such as 3D object reconstruction, robot navigation and quality control. Although the projection of regular patterns solves the problem of points without a match, it does not solve the problem of multiple matching, which forces us to use computationally expensive algorithms to search for the correct matches. In recent years, another structured light technique has gained importance. This technique is based on coding the light projected onto the scene so that it can be used to obtain a unique match. Each token of light is imaged by the camera, and we have to read its label (decode the pattern) in order to solve the correspondence problem. The advantages and disadvantages of stereo vision compared with structured light are discussed, and a survey of coded structured light is presented. The work carried out in this thesis has led to a new coded structured light pattern which solves the correspondence problem uniquely and robustly: uniquely, as each token of light is coded by a different word, which removes the problem of multiple matching; robustly, since the pattern has been coded using the position of each token of light with respect to both coordinate axes. Algorithms and experimental results are included in the thesis. The reader can see examples of 3D measurement of static objects, and of the more complicated measurement of moving objects.
The technique can be used in both cases, as the pattern is coded in a single projection shot. It can therefore be used in several robot vision applications. Our interest is focused on the mathematical study of the camera and pattern projector models. We are also interested in how these models can be obtained by calibration, and how they can be used to obtain three-dimensional information from two corresponding points. Furthermore, we have studied structured light and coded structured light, and we have presented a new coded structured light pattern. However, in this thesis we started from the assumption that the corresponding points could be well segmented from the captured image. Computer vision constitutes a huge problem, and a lot of work is being done at all levels of human vision modelling, starting from (a) image acquisition; (b) further image enhancement, filtering and processing; and (c) image segmentation, which involves thresholding, thinning, contour detection, texture and colour analysis, and so on. The interest of this thesis starts at the next step, usually known as depth perception or 3D measurement.
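The idea of coding each token of light with a unique word can be illustrated with a De Bruijn sequence, a scheme commonly used for single-shot stripe patterns. This is not the pattern proposed in the thesis, which codes tokens along both coordinate axes; it is a one-dimensional sketch, and the names `de_bruijn` and `build_lookup` are hypothetical.

```python
def de_bruijn(k, n):
    """Return a cyclic De Bruijn sequence over the alphabet {0, ..., k-1}
    in which every word of length n appears exactly once."""
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def build_lookup(pattern, n):
    """Map every window of n consecutive symbols to its position in the pattern."""
    return {tuple(pattern[i:i + n]): i for i in range(len(pattern) - n + 1)}


# Three colours (k = 3) and words of three stripes (n = 3) give 27 unique words.
n = 3
cyclic = de_bruijn(3, n)
pattern = cyclic + cyclic[:n - 1]   # unroll the cyclic sequence for linear projection
lookup = build_lookup(pattern, n)

# Decoding: a window of n stripe colours read from the image gives its pattern position.
observed = tuple(pattern[5:5 + n])
position = lookup[observed]
```

Because every window is unique, reading any `n` adjacent stripes in the camera image identifies where they came from in the projected pattern, which removes the multiple-matching ambiguity for that window.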
Abstract:
"Exhibiting is or should be to work against ignorance, especially against the most refractory of all ignorance: the pre-conceived idea of stereotyped culture. To exhibit is to take a calculated risk of disorientation - in the etymological sense (to lose your bearings): it disturbs the harmony, the evident, and the consensus that constitute the commonplace (the banal). Needless to say, however, an exhibition that deliberately tries to scandalise will create an inverted perversion which results in an obscurantist pseudo-luxury culture ... between demagogy and provocation, one has to find visual communication's subtle itinerary. Even though an intermediary route is not so stimulating: as Gaston Bachelard said, 'All the roads lead to Rome, except the roads of compromise.'"
Abstract:
Exhibiting is or should be to work against ignorance, especially against the most refractory of all ignorance: the pre-conceived idea of stereotyped culture. To exhibit is to take a calculated risk of disorientation - in the etymological sense (to lose your bearings): it disturbs the harmony, the evident, and the consensus that constitute the commonplace (the banal). Needless to say, however, an exhibition that deliberately tries to scandalise will create an inverted perversion which results in an obscurantist pseudo-luxury culture ... between demagogy and provocation, one has to find visual communication's subtle itinerary. Even though an intermediary route is not so stimulating: as Gaston Bachelard said, "All the roads lead to Rome, except the roads of compromise." It is becoming ever more evident that museums have undergone changes that are noticeable in numerous areas. As well as the traditional functions of collecting, conserving and exhibiting objects, museums have tried to become a means of communication, open to and aware of the concerns of modern society. To do this, they have started to use the modern technology now available, led by the hand of "marketing" and modern business management.
Abstract:
This research took as its object of study the Ateliê Caderneta de Cromos, part of the Geração Cool project, the result of a partnership between several institutions in the municipality of Almada. It is a multicultural community social development project associated with a school. Through a case study, the aim was to analyse critically a non-formal model of visual arts practices aimed at at-risk young people, and to show how the learning developed in a visual arts workshop contributes to the inclusion of these young people. A broad theoretical framework was developed to support the initial questions, addressing issues considered pertinent, such as: contemporary views of the multicultural realities of urban peripheries; postmodernist perspectives on art education that advocate cognitive construction; the action of artistic projects for social development; and the ethical and social importance of current art curricula. Taking as a reference the hypothesis of antagonism and/or complementarity between the non-formal and the institutional pedagogical act, the study sought to establish a relationship between this practice and inclusion by identifying and unpacking a strategic roadmap of learning. The pedagogical act in the Ateliê was shown to foster constructive learning in the responses produced; this learning was also meaningful, owing to its lived experiential character, and cognitive, in that it determines the construction of a meaning, understood as the assumption of identity itself.
Abstract:
This paper reports the current state of work to simplify our previous model-based methods for visual tracking of vehicles for use in a real-time system intended to provide continuous monitoring and classification of traffic from a fixed camera on a busy multi-lane motorway. The main constraints of the system design were: (i) all low level processing to be carried out by low-cost auxiliary hardware, (ii) all 3-D reasoning to be carried out automatically off-line, at set-up time. The system developed uses three main stages: (i) pose and model hypothesis using 1-D templates, (ii) hypothesis tracking, and (iii) hypothesis verification, using 2-D templates. Stages (i) & (iii) have radically different computing performance and computational costs, and need to be carefully balanced for efficiency. Together, they provide an effective way to locate, track and classify vehicles.
Abstract:
Recent work has suggested that for some tasks, graphical displays which visually integrate information from more than one source offer an advantage over more traditional displays which present the same information in a separated format. Three experiments are described which investigate this claim using a task which requires subjects to control a dynamic system. In the first experiment, the integrated display is compared to two separated displays, one an animated mimic diagram, the other an alphanumeric display. The integrated display is shown to support better performance in a control task, but experiment 2 shows that part of this advantage may be due to its analogue nature. Experiment 3 considers performance on a fault detection task, and shows no difference between the integrated and separated displays. The paper concludes that previous claims made for integrated displays may not generalize from monitoring to control tasks.
Abstract:
The coding of body part location may depend upon both visual and proprioceptive information, and allows targets to be localized with respect to the body. The present study investigates the interaction between visual and proprioceptive localization systems under conditions of multisensory conflict induced by optokinetic stimulation (OKS). Healthy subjects were asked to estimate the apparent motion speed of a visual target (LED) that could be located either in the extrapersonal space (visual encoding only, V), or at the same distance, but stuck on the subject's right index finger-tip (visual and proprioceptive encoding, V-P). Additionally, the multisensory condition was performed with the index finger kept in position both passively (V-P passive) and actively (V-P active). Results showed that the visual stimulus was always perceived to move, irrespective of its out- or on-the-body location. Moreover, this apparent motion speed varied consistently with the speed of the moving OKS background in all conditions. Surprisingly, no differences were found between V-P active and V-P passive conditions in the speed of apparent motion. The persistence of the visual illusion during the active posture maintenance reveals a novel condition in which vision totally dominates over proprioceptive information, suggesting that the hand-held visual stimulus was perceived as a purely visual, external object despite its contact with the hand.
Abstract:
The authors assessed rats' encoding of the appearance or egocentric position of objects within visual scenes containing 3 objects (Experiment 1) or 1 object (Experiment 2A). Experiment 2B assessed encoding of the shape and fill pattern of single objects, and encoding of configurations (object + position, shape + fill). All were assessed by testing rats' ability to discriminate changes from familiar scenes (constant-negative paradigm). Perirhinal cortex lesions impaired encoding of objects and their shape; postrhinal cortex lesions impaired encoding of egocentric position, but the effect may have been partly due to entorhinal involvement. Neither lesioned group was impaired in detecting configural change. In Experiment 1, both lesion groups were impaired in detecting small changes in relative position of the 3 objects, suggesting that more sensitive tests might reveal configural encoding deficits.
Abstract:
Perirhinal cortex in monkeys has been thought to be involved in visual associative learning. The authors examined rats' ability to make associations between visual stimuli in a visual secondary reinforcement task. Rats learned 2-choice visual discriminations for secondary visual reinforcement. They showed significant learning of discriminations before any primary reinforcement. Following bilateral perirhinal cortex lesions, rats continued to learn visual discriminations for visual secondary reinforcement at the same rate as before surgery. Thus, this study does not support a critical role of perirhinal cortex in learning for visual secondary reinforcement. Contrasting this result with other positive results, the authors suggest that the role of perirhinal cortex is in "within-object" associations and that it plays a much lesser role in stimulus-stimulus associations between objects.