939 results for visual object detection
Abstract:
This thesis deals with the combination of visual servoing and structured light. Classical visual servoing assumes that visual features can be easily extracted from images. As a result, objects with a uniform or poorly textured appearance cannot be handled. In this thesis we propose using structured light to provide objects with visual features regardless of their appearance. First, a broad study of structured light is presented, which allows us to propose a new coded pattern that improves on existing ones. The rest of the thesis focuses on positioning a robot equipped with a camera with respect to different objects, using the information provided by the projection of different light patterns. Two configurations have been studied: one in which the light projector is separate from the robot, and one in which the projector is mounted on the robot together with the camera. The techniques proposed in the thesis are supported by an extensive analytical study and validated by experimental results.
Abstract:
The human visual ability to perceive depth looks like a puzzle. We perceive three-dimensional spatial information quickly and efficiently by using the binocular stereopsis of our eyes and, what is more important, the knowledge of the most common objects that we have acquired through experience. Nowadays, modelling the behaviour of our brain is still fiction, which is why the huge problem of 3D perception and subsequent interpretation is split into a sequence of easier problems. A lot of research in robot vision is devoted to obtaining 3D information about the surrounding scene. Most of this research is based on modelling human stereopsis by using two cameras as if they were two eyes. This method is known as stereo vision; it has been widely studied in the past, is being studied at present, and will surely receive a lot of further work in the future. This allows us to affirm that the topic is one of the most interesting in computer vision. The stereo vision principle is based on obtaining the three-dimensional position of an object point from the positions of its projections in both camera image planes. However, before inferring 3D information, the mathematical models of both cameras have to be known. This step is known as camera calibration and is described at length in the thesis. Perhaps the most important problem in stereo vision is determining the pairs of homologous points in the two images, known as the correspondence problem; it is also one of the most difficult problems to solve and is currently being investigated by many researchers. Epipolar geometry allows us to reduce the correspondence problem, and an approach to epipolar geometry is described in the thesis. Nevertheless, it does not solve the problem completely, as many considerations have to be taken into account. For example, we have to consider points without correspondence due to a surface occlusion or simply due to a projection outside the camera's field of view.
The thesis focuses on structured light, which is considered one of the most frequently used techniques for reducing the problems related to stereo vision. Structured light is based on the relationship between a projected light pattern, its projection, and an image sensor. The deformations between the pattern projected onto the scene and the one captured by the camera make it possible to obtain three-dimensional information about the illuminated scene. This technique has been widely used in applications such as 3D object reconstruction, robot navigation, quality control, and so on. Although the projection of regular patterns solves the problem of points without a match, it does not solve the problem of multiple matching, which forces us to use computationally expensive algorithms to search for the correct matches. In recent years, another structured light technique has grown in importance. This technique is based on coding the light projected onto the scene so that it can be used as a tool to obtain a unique match. Each token of light is imaged by the camera, and we have to read its label (decode the pattern) in order to solve the correspondence problem. The advantages and disadvantages of stereo vision compared with structured light, together with a survey of coded structured light, are presented and discussed. The work carried out in this thesis has made it possible to present a new coded structured light pattern that solves the correspondence problem uniquely and robustly: uniquely, as each token of light is coded by a different word, which removes the problem of multiple matching; robustly, since the pattern has been coded using the position of each token of light with respect to both coordinate axes. Algorithms and experimental results are included in the thesis. The reader can see examples of 3D measurement of static objects and of the more complicated measurement of moving objects.
The technique can be used in both cases because the pattern is coded in a single projection shot, so it can be used in several robot vision applications. Our interest is focused on the mathematical study of the camera and pattern projector models. We are also interested in how these models can be obtained by calibration, and how they can be used to obtain three-dimensional information from two corresponding points. Furthermore, we have studied structured light and coded structured light, and we have presented a new coded structured light pattern. However, in this thesis we started from the assumption that the corresponding points could be well segmented from the captured image. Computer vision constitutes a huge problem, and a lot of work is being done at all levels of human vision modelling, starting from (a) image acquisition; (b) further image enhancement, filtering and processing; and (c) image segmentation, which involves thresholding, thinning, contour detection, texture and colour analysis, and so on. The interest of this thesis starts in the next step, usually known as depth perception or 3D measurement.
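The step of recovering a 3D point from two corresponding image points, given calibrated models of both devices, can be illustrated with a minimal sketch. It uses standard linear (DLT) triangulation, which is an assumption here and not necessarily the method used in the thesis. The function names `triangulate` and `project`, the 3x4 projection matrices `P1` and `P2`, and all numeric values are hypothetical; in a structured light setup the second matrix would typically model the projector in place of a second camera.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X (length 3) with a 3x4 projection matrix P."""
    x = P @ np.append(X, 1.0)          # homogeneous image coordinates
    return x[:2] / x[2]                # pixel coordinates (u, v)

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.

    P1, P2: 3x4 projection matrices obtained by calibration (assumed known).
    x1, x2: corresponding image points (u, v) in each view.
    """
    # Each view contributes two linear equations in the homogeneous point X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Hypothetical calibration: identical intrinsics, 10 cm horizontal baseline.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 2.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noise-free correspondences the estimate coincides with the original point up to numerical precision; with real data the result depends on how well the correspondence problem has been solved, which is the issue the coded pattern addresses.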
Abstract:
"Exhibiting is or should be to work against ignorance, especially against the most refractory of all ignorance: the pre-conceived idea of stereotyped culture. To exhibit is to take a calculated risk of disorientation, in the etymological sense (to lose your bearings); it disturbs the harmony, the evident, and the consensus that constitute the commonplace (the banal). Needless to say, however, an exhibition that deliberately tries to scandalise will create an inverted perversion which results in an obscurantist pseudo-luxury culture ... between demagogy and provocation, one has to find visual communication's subtle itinerary. Even though an intermediary route is not so stimulating: as Gaston Bachelard said, 'All the roads lead to Rome, except the roads of compromise.'"
Abstract:
Exhibiting is or should be to work against ignorance, especially against the most refractory of all ignorance: the pre-conceived idea of stereotyped culture. To exhibit is to take a calculated risk of disorientation, in the etymological sense (to lose your bearings); it disturbs the harmony, the evident, and the consensus that constitute the commonplace (the banal). Needless to say, however, an exhibition that deliberately tries to scandalise will create an inverted perversion which results in an obscurantist pseudo-luxury culture ... between demagogy and provocation, one has to find visual communication's subtle itinerary. Even though an intermediary route is not so stimulating: as Gaston Bachelard said, "All the roads lead to Rome, except the roads of compromise." It is becoming ever more evident that museums have undergone changes that are noticeable in numerous areas. As well as performing the traditional functions of collecting, conserving and exhibiting objects, museums have tried to become a means of communication, open to and aware of the worries of modern society. To do this, they have started to use the modern technology now available, led by the hand of "marketing" and modern business management.
Abstract:
This research studied the Ateliê Caderneta de Cromos workshop of the Geração Cool project, the result of a partnership between several institutions in the municipality of Almada. It is a multicultural community social development project associated with a school. Through a case study, the aim was to critically analyse a non-formal model of practices linked to the visual arts, aimed at at-risk young people, and to show how the learning developed in a visual arts workshop contributes to the inclusion process of these young people. A comprehensive theoretical framework was developed to support the initial questions, addressing issues considered pertinent, such as: contemporary views of the multicultural realities of urban peripheries; postmodernist perspectives on art education that advocate cognitive construction; the action of artistic projects for social development; and the ethical and social importance of current art curricula. Taking as a reference the hypothesis of antagonism and/or complementarity between the non-formal and the institutional pedagogical act, the study sought to establish a relationship between this practice and inclusion by identifying and taking apart a strategic learning itinerary. The pedagogical act in the Ateliê was shown to foster constructive learning in the responses produced; this learning was also meaningful, owing to the experiential character of what was lived, and cognitive, in the sense that it determines the construction of a meaning, understood as the very assumption of identity.
Abstract:
This paper reports the current state of work to simplify our previous model-based methods for visual tracking of vehicles for use in a real-time system intended to provide continuous monitoring and classification of traffic from a fixed camera on a busy multi-lane motorway. The main constraints of the system design were: (i) all low level processing to be carried out by low-cost auxiliary hardware, (ii) all 3-D reasoning to be carried out automatically off-line, at set-up time. The system developed uses three main stages: (i) pose and model hypothesis using 1-D templates, (ii) hypothesis tracking, and (iii) hypothesis verification, using 2-D templates. Stages (i) & (iii) have radically different computing performance and computational costs, and need to be carefully balanced for efficiency. Together, they provide an effective way to locate, track and classify vehicles.
Abstract:
Recent work has suggested that for some tasks, graphical displays which visually integrate information from more than one source offer an advantage over more traditional displays which present the same information in a separated format. Three experiments are described which investigate this claim using a task which requires subjects to control a dynamic system. In the first experiment, the integrated display is compared to two separated displays, one an animated mimic diagram, the other an alphanumeric display. The integrated display is shown to support better performance in a control task, but experiment 2 shows that part of this advantage may be due to its analogue nature. Experiment 3 considers performance on a fault detection task, and shows no difference between the integrated and separated displays. The paper concludes that previous claims made for integrated displays may not generalize from monitoring to control tasks.
Abstract:
The coding of body part location may depend upon both visual and proprioceptive information, and allows targets to be localized with respect to the body. The present study investigates the interaction between visual and proprioceptive localization systems under conditions of multisensory conflict induced by optokinetic stimulation (OKS). Healthy subjects were asked to estimate the apparent motion speed of a visual target (LED) that could be located either in the extrapersonal space (visual encoding only, V), or at the same distance, but stuck on the subject's right index finger-tip (visual and proprioceptive encoding, V-P). Additionally, the multisensory condition was performed with the index finger kept in position both passively (V-P passive) and actively (V-P active). Results showed that the visual stimulus was always perceived to move, irrespective of its out- or on-the-body location. Moreover, this apparent motion speed varied consistently with the speed of the moving OKS background in all conditions. Surprisingly, no differences were found between V-P active and V-P passive conditions in the speed of apparent motion. The persistence of the visual illusion during the active posture maintenance reveals a novel condition in which vision totally dominates over proprioceptive information, suggesting that the hand-held visual stimulus was perceived as a purely visual, external object despite its contact with the hand.
Abstract:
The authors assessed rats' encoding of the appearance or egocentric position of objects within visual scenes containing 3 objects (Experiment 1) or 1 object (Experiment 2A). Experiment 2B assessed encoding of the shape and fill pattern of single objects, and encoding of configurations (object + position, shape + fill). All were assessed by testing rats' ability to discriminate changes from familiar scenes (constant-negative paradigm). Perirhinal cortex lesions impaired encoding of objects and their shape; postrhinal cortex lesions impaired encoding of egocentric position, but the effect may have been partly due to entorhinal involvement. Neither lesioned group was impaired in detecting configural change. In Experiment 1, both lesion groups were impaired in detecting small changes in relative position of the 3 objects, suggesting that more sensitive tests might reveal configural encoding deficits.
Abstract:
Perirhinal cortex in monkeys has been thought to be involved in visual associative learning. The authors examined rats' ability to make associations between visual stimuli in a visual secondary reinforcement task. Rats learned 2-choice visual discriminations for secondary visual reinforcement. They showed significant learning of discriminations before any primary reinforcement. Following bilateral perirhinal cortex lesions, rats continued to learn visual discriminations for visual secondary reinforcement at the same rate as before surgery. Thus, this study does not support a critical role of perirhinal cortex in learning for visual secondary reinforcement. Contrasting this result with other positive results, the authors suggest that the role of perirhinal cortex is in "within-object" associations and that it plays a much lesser role in stimulus-stimulus associations between objects.
Abstract:
Between 8 and 40% of Parkinson disease (PD) patients will have visual hallucinations (VHs) during the course of their illness. Although cognitive impairment has been identified as a risk factor for hallucinations, more specific neuropsychological deficits underlying such phenomena have not been established. Research in psychopathology has converged to suggest that hallucinations are associated with confusion between internal representations of events and real events (i.e. impaired source monitoring). We evaluated three groups: 17 Parkinson's patients with visual hallucinations, 20 Parkinson's patients without hallucinations and 20 age-matched controls, using tests of visual imagery, visual perception and memory, including tests of source monitoring and recollective experience. The study revealed that Parkinson's patients with hallucinations appear to have intact visual imagery processes and spatial perception. However, there were impairments in object perception and recognition memory, and poor recollection of the encoding episode in comparison to both non-hallucinating Parkinson's patients and healthy controls. Errors were especially likely to occur when encoding and retrieval cues were in different modalities. The findings raise the possibility that visual hallucinations in Parkinson's patients could stem from a combination of faulty perceptual processing of environmental stimuli, and less detailed recollection of experience combined with intact image generation. (C) 2002 Elsevier Science Ltd. All rights reserved.
Abstract:
In the past decade, airborne based LIght Detection And Ranging (LIDAR) has been recognised by both the commercial and public sectors as a reliable and accurate source for land surveying in environmental, engineering and civil applications. Commonly, the first task to investigate LIDAR point clouds is to separate ground and object points. Skewness Balancing has been proven to be an efficient non-parametric unsupervised classification algorithm to address this challenge. Initially developed for moderate terrain, this algorithm needs to be adapted to handle sloped terrain. This paper addresses the difficulty of object and ground point separation in LIDAR data in hilly terrain. A case study on a diverse LIDAR data set in terms of data provider, resolution and LIDAR echo has been carried out. Several sites in urban and rural areas with man-made structure and vegetation in moderate and hilly terrain have been investigated and three categories have been identified. A deeper investigation on an urban scene with a river bank has been selected to extend the existing algorithm. The results show that an iterative use of Skewness Balancing is suitable for sloped terrain.
Abstract:
Light Detection And Ranging (LIDAR) is an important modality in terrain and land surveying for many environmental, engineering and civil applications. This paper presents the framework for a recently developed unsupervised classification algorithm called Skewness Balancing for object and ground point separation in airborne LIDAR data. The main advantages of the algorithm are threshold-freedom and independence from LIDAR data format and resolution, while preserving object and terrain details. The framework for Skewness Balancing has been built in this contribution with a prediction model in which unknown LIDAR tiles can be categorised as “hilly” or “moderate” terrains. Accuracy assessment of the model is carried out using cross-validation with an overall accuracy of 95%. An extension to the algorithm is developed to address the overclassification issue for hilly terrain. For moderate terrain, the results show that from the classified tiles detached objects (buildings and vegetation) and attached objects (bridges and motorway junctions) are separated from bare earth (ground, roads and yards) which makes Skewness Balancing ideal to be integrated into geographic information system (GIS) software packages.
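The abstract above does not spell out the algorithm, so the following is only a sketch of the basic idea as it is commonly described: elevations of bare earth are assumed to be roughly symmetrically distributed, object points add a positive skew, and the highest points are peeled off until the skewness of the remainder is no longer positive. No threshold on height is needed. The function names `skewness` and `skewness_balancing` and the sample elevations are hypothetical, and the extension for hilly terrain is not covered.

```python
import numpy as np

def skewness(z):
    """Sample skewness (third standardised moment) of a 1-D array."""
    z = np.asarray(z, dtype=float)
    sigma = z.std()
    if sigma == 0.0:
        return 0.0                      # constant data has no skew
    return float(np.mean(((z - z.mean()) / sigma) ** 3))

def skewness_balancing(elevations):
    """Label points as object (True) or ground (False).

    Assumption: ground elevations are roughly symmetric, so the highest
    points are removed one by one while the remaining skewness is positive.
    """
    z = np.asarray(elevations, dtype=float)
    order = np.argsort(z)               # indices from lowest to highest
    n_ground = len(z)
    while n_ground > 2 and skewness(z[order[:n_ground]]) > 0.0:
        n_ground -= 1                   # peel off the current highest point
    is_object = np.zeros(len(z), dtype=bool)
    is_object[order[n_ground:]] = True
    return is_object

# Hypothetical tile: ground near 100 m plus two building returns.
tile = [100, 97, 110, 101, 99, 101, 112, 100, 101, 101]
labels = skewness_balancing(tile)
```

This naive loop recomputes the skewness after every removal, which is quadratic in the number of points; a practical implementation would update the moments incrementally.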
Abstract:
This paper describes the novel use of agent and cellular neural Hopfield network techniques in the design of a self-contained, object detecting retina. The agents, which are used to detect features within an image, are trained using the Hebbian method which has been modified for the cellular architecture. The success of each agent is communicated with adjacent agents in order to verify the detection of an object. Initial work used the method to process bipolar images. This has now been extended to handle grey scale images. Simulations have demonstrated the success of the method and further work is planned in which the device is to be implemented in hardware.
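For readers unfamiliar with the underlying technique, here is a minimal sketch of a standard, fully connected Hopfield network trained with the plain Hebbian rule on bipolar (+1/-1) patterns. It is not the modified Hebbian method or the cellular, agent-based architecture of the paper, whose details the abstract does not give. The function names `train_hebbian` and `recall` and the example patterns are hypothetical.

```python
import numpy as np

def train_hebbian(patterns):
    """Build a Hopfield weight matrix from bipolar (+1/-1) patterns."""
    patterns = np.asarray(patterns, dtype=float)
    n = patterns.shape[1]
    W = patterns.T @ patterns / n       # Hebbian rule: sum of outer products
    np.fill_diagonal(W, 0.0)            # no self-connections
    return W

def recall(W, state, max_sweeps=10):
    """Asynchronously update units until the state stops changing."""
    s = np.array(state, dtype=float)
    for _ in range(max_sweeps):
        previous = s.copy()
        for i in range(len(s)):
            s[i] = 1.0 if W[i] @ s >= 0.0 else -1.0
        if np.array_equal(s, previous):
            break                       # reached a stable state
    return s

# Two hypothetical, mutually orthogonal stored patterns.
stored = np.array([
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
])
W = train_hebbian(stored)

noisy = stored[0].copy()
noisy[0] = -noisy[0]                    # corrupt one element
restored = recall(W, noisy)
```

In this small example the corrupted input settles back onto the first stored pattern, which is the kind of feature recall that the detecting agents in the abstract rely on in a more elaborate form.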