974 results for Visual Tracking


Relevance: 20.00%

Abstract:

This thesis presents a novel framework for state estimation in the context of robotic grasping and manipulation. The overall estimation approach is based on fusing various visual cues for manipulator tracking, namely appearance and feature-based, shape-based, and silhouette-based visual cues. Similarly, a framework is developed to fuse the above visual cues with kinesthetic cues, such as force-torque and tactile measurements, for in-hand object pose estimation. The cues are extracted from multiple sensor modalities and are fused in a variety of Kalman filters.

A hybrid estimator is developed to estimate both a continuous state (robot and object states) and discrete states, called contact modes, which specify how each finger contacts a particular object surface. A static multiple model estimator is used to compute and maintain this mode probability. The thesis also develops an estimation framework for estimating model parameters associated with object grasping. Dual and joint state-parameter estimation is explored for parameter estimation of a grasped object's mass and center of mass. Experimental results demonstrate simultaneous object localization and center of mass estimation.
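As a rough illustration of how a static multiple-model estimator can maintain a mode probability, the sketch below applies a Bayes update to a fixed set of candidate contact modes. It is not the thesis's implementation: the two mode names, the predicted fingertip forces, the scalar Gaussian measurement model, and the noise variance are all assumptions made for the example.

```python
import math


def gaussian_likelihood(residual, variance):
    """Likelihood of a scalar residual under a zero-mean Gaussian."""
    return math.exp(-0.5 * residual ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def update_mode_probabilities(priors, likelihoods):
    """Bayes update for a static multiple-model estimator (no mode switching)."""
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Every model assigns zero likelihood: keep the prior unchanged.
        return list(priors)
    return [u / total for u in unnormalized]


# Hypothetical contact modes, each predicting a fingertip normal force in newtons.
predicted_force = {"no_contact": 0.0, "surface_contact": 2.0}
modes = list(predicted_force)
probs = [0.5, 0.5]          # uniform prior over the modes
noise_variance = 0.25       # assumed force-sensor noise variance (N^2)

for measured in [1.8, 2.1, 1.9]:  # made-up force readings
    liks = [gaussian_likelihood(measured - predicted_force[m], noise_variance) for m in modes]
    probs = update_mode_probabilities(probs, liks)

mode_probability = dict(zip(modes, probs))
```

In a full hybrid estimator, each mode would typically run its own filter, and the likelihoods would come from that filter's innovation rather than from a fixed predicted force.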

Dual-arm estimation is developed for two-arm robotic manipulation tasks. Two types of filters are explored: the first is an augmented filter that contains both arms in the state vector, while the second runs two filters in parallel, one for each arm. These two frameworks and their performance are compared in a dual-arm task of removing a wheel from a hub.

This thesis also presents a new method for action selection involving touch. This next best touch method selects, from the available actions for interacting with an object, the one expected to gain the most information. The algorithm employs information theory to compute an information gain metric that is based on a probabilistic belief suitable for the task. An estimation framework is used to maintain this belief over time. Kinesthetic measurements such as contact and tactile measurements are used to update the state belief after every interactive action. Simulation and experimental results are demonstrated using next best touch for object localization, specifically of a door handle on a door. The next best touch theory is extended for model parameter determination. Since many objects within a particular object category share the same rough shape, principal component analysis may be used to parametrize the object mesh models. These parameters can be estimated using the action selection technique, which selects the touching action that best both localizes the object and estimates these parameters. Simulation results are then presented involving localizing and determining a parameter of a screwdriver.
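The action-selection idea can be sketched with a discrete belief and expected entropy reduction as the information gain metric. This is only an illustrative sketch: the four-cell state space, the two hypothetical touch actions, and the noise-free binary contact sensor are assumptions, and the thesis's actual metric and belief representation may differ.

```python
import math


def entropy(belief):
    """Shannon entropy in bits of a discrete belief."""
    return -sum(p * math.log2(p) for p in belief if p > 0.0)


def expected_information_gain(belief, outcome_likelihoods):
    """Expected entropy reduction for one action.

    outcome_likelihoods[o][s] is P(outcome o | state s) for that action.
    """
    expected_posterior_entropy = 0.0
    for likelihood in outcome_likelihoods:
        p_outcome = sum(b * l for b, l in zip(belief, likelihood))
        if p_outcome > 0.0:
            posterior = [b * l / p_outcome for b, l in zip(belief, likelihood)]
            expected_posterior_entropy += p_outcome * entropy(posterior)
    return entropy(belief) - expected_posterior_entropy


def next_best_touch(belief, actions):
    """Return the name of the action with the largest expected information gain."""
    return max(actions, key=lambda name: expected_information_gain(belief, actions[name]))


# Hypothetical setup: the handle is in one of four cells, with a uniform prior.
belief = [0.25, 0.25, 0.25, 0.25]
# Each action lists [P(contact | state), P(no contact | state)], assuming a noise-free sensor.
actions = {
    "touch_cell_0": [[1, 0, 0, 0], [0, 1, 1, 1]],
    "touch_left_half": [[1, 1, 0, 0], [0, 0, 1, 1]],
}
best_action = next_best_touch(belief, actions)
best_gain = expected_information_gain(belief, actions[best_action])
```

With a uniform prior, the touch that splits the candidate cells evenly is preferred, because either outcome rules out half of the hypotheses.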

Lastly, the next best touch theory is further extended to model classes. Instead of estimating parameters, object class determination is incorporated into the information gain metric calculation. The best touching action is selected in order to best discern between the possible model classes. Simulation results are presented to validate the theory.

Relevance: 20.00%

Abstract:

The temporal structure of neuronal spike trains in the visual cortex can provide detailed information about the stimulus and about the neuronal implementation of visual processing. Spike trains recorded from the macaque motion area MT in previous studies (Newsome et al., 1989a; Britten et al., 1992; Zohary et al., 1994) are analyzed here in the context of the dynamic random dot stimulus which was used to evoke them. If the stimulus is incoherent, the spike trains can be highly modulated and precisely locked in time to the stimulus. In contrast, the coherent motion stimulus creates little or no temporal modulation and allows us to study patterns in the spike train that may be intrinsic to the cortical circuitry in area MT. Long gaps in the spike train evoked by the preferred direction motion stimulus are found, and they appear to be symmetrical to bursts in the response to the anti-preferred direction of motion. A novel cross-correlation technique is used to establish that the gaps are correlated between pairs of neurons. Temporal modulation is also found in psychophysical experiments using a modified stimulus. A model is made that can account for the temporal modulation in terms of the computational theory of biological image motion processing. A frequency domain analysis of the stimulus reveals that it contains a repeated power spectrum that may account for psychophysical and electrophysiological observations.

Some neurons tend to fire bursts of action potentials while others avoid burst firing. Using numerical and analytical models of spike trains as Poisson processes with the addition of refractory periods and bursting, we are able to account for peaks in the power spectrum near 40 Hz without assuming the existence of an underlying oscillatory signal. A preliminary examination of the local field potential reveals that stimulus-locked oscillation appears briefly at the beginning of the trial.

Relevance: 20.00%

Abstract:

This thesis presents a biologically plausible model of an attentional mechanism for forming position- and scale-invariant representations of objects in the visual world. The model relies on a set of control neurons to dynamically modify the synaptic strengths of intra-cortical connections so that information from a windowed region of primary visual cortex (V1) is selectively routed to higher cortical areas. Local spatial relationships (i.e., topography) within the attentional window are preserved as information is routed through the cortex, thus enabling attended objects to be represented in higher cortical areas within an object-centered reference frame that is position and scale invariant. The representation in V1 is modeled as a multiscale stack of sample nodes with progressively lower resolution at higher eccentricities. Large changes in the size of the attentional window are accomplished by switching between different levels of the multiscale stack, while positional shifts and small changes in scale are accomplished by translating and rescaling the window within a single level of the stack. The control signals for setting the position and size of the attentional window are hypothesized to originate from neurons in the pulvinar and in the deep layers of visual cortex. The dynamics of these control neurons are governed by simple differential equations that can be realized by neurobiologically plausible circuits. In pre-attentive mode, the control neurons receive their input from a low-level "saliency map" representing potentially interesting regions of a scene. During the pattern recognition phase, control neurons are driven by the interaction between top-down (memory) and bottom-up (retinal input) sources. The model respects key neurophysiological, neuroanatomical, and psychophysical data relating to attention, and it makes a variety of experimentally testable predictions.

Relevance: 20.00%

Abstract:

Humans are capable of distinguishing more than 5000 visual categories even in complex environments, using a variety of different visual systems all working in tandem. We seem to be capable of distinguishing thousands of different odors as well. In the machine learning community, many commonly used multi-class classifiers do not scale well to such large numbers of categories. This thesis demonstrates a method of automatically creating application-specific taxonomies to aid in scaling classification algorithms to more than 100 categories using both visual and olfactory data. The visual data consists of images collected online and pollen slides scanned under a microscope. The olfactory data was acquired by constructing a small portable sniffing apparatus which draws air over 10 carbon black polymer composite sensors. We investigate performance when classifying 256 visual categories, 8 or more species of pollen and 130 olfactory categories sampled from common household items and a standardized scratch-and-sniff test. Taxonomies are employed in a divide-and-conquer classification framework which improves classification time while allowing the end user to trade performance for specificity as needed. Before classification can even take place, the pollen counter and electronic nose must filter out a high volume of background “clutter” to detect the categories of interest. In the case of pollen this is done with an efficient cascade of classifiers that rule out most non-pollen before invoking slower multi-class classifiers. In the case of the electronic nose, much of the extraneous noise encountered in outdoor environments can be filtered using a sniffing strategy which preferentially samples the sensor response at frequencies that are relatively immune to background contributions from ambient water vapor.
This combination of efficient background rejection with scalable classification algorithms is tested in detail for three separate projects: 1) the Caltech-256 Image Dataset, 2) the Caltech Automated Pollen Identification and Counting System (CAPICS) and 3) a portable electronic nose specially constructed for outdoor use.

Relevance: 20.00%

Abstract:

Activity-dependent modulation of sensory systems has been documented in many organisms, and is likely to be essential for appropriate processing of information during different behavioral states. However, the mechanisms underlying these phenomena, and often their functional consequences, remain poorly characterized. I investigated the role of octopamine neurons in the flight-dependent modulation observed in visual interneurons in the fruit fly Drosophila melanogaster. The vertical system (VS) cells exhibit a boost in their response to visual motion during flight compared to quiescence. Pharmacological application of octopamine evokes responses in quiescent flies that mimic those observed during flight, and octopamine neurons that project to the optic lobes increase in activity during flight. Using genetic tools to manipulate the activity of octopamine neurons, I find that they are both necessary and sufficient for the flight-induced visual boost. This work provides the first evidence that endogenous release of octopamine is involved in state-dependent modulation of visual interneurons in flies. Further, I investigated the role of a single pair of octopamine neurons that project to the optic lobes, and found no evidence that chemical synaptic transmission via these neurons is necessary for the flight boost. However, I found some evidence that activation of these neurons may contribute to the flight boost. Wind stimuli alone are sufficient to generate transient increases in the VS cell response to motion vision, but result in no increase in baseline membrane potential. These results suggest that the flight boost originates not from a central command signal during flight, but from mechanosensory stimuli relayed via the octopamine system. Lastly, in an attempt to understand the functional consequences of the flight boost observed in visual interneurons, we measured the effect of inactivating octopamine neurons in freely flying flies. 
We found that flies whose octopamine neurons we silenced accelerate less than wild-type flies, consistent with the hypothesis that the flight boost we observe in VS cells is indicative of a gain control mechanism mediated by octopamine neurons. Together, this work serves as the basis for a mechanistic and functional understanding of octopaminergic modulation of vision in flying flies.

Relevance: 20.00%

Abstract:

In single-particle tracking (SPT), fluorescence video microscopy is used to record the motion of single particles or single molecules. Here, video images of single nanobeads in solution were obtained using a total-internal-reflection microscope equipped with an argon ion laser and a high-speed, high-sensitivity charge-coupled device (CCD) camera. From the trajectories, the diffusion coefficient of each nanobead was determined from the mean square displacement as a function of time. The sizes of the nanobeads were calculated using the Stokes-Einstein equation, and the results were compared with the actual values.
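The analysis chain described above (trajectory, mean square displacement, diffusion coefficient, bead size) can be sketched as follows. The sketch assumes free two-dimensional Brownian motion, so that MSD(τ) = 4Dτ, ignores localization error, and takes the Stokes-Einstein radius as r = k_B T / (6πηD). The simulated track, frame interval, temperature, and water viscosity are illustrative values, not data from the study.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K


def mean_square_displacement(track, lag):
    """Time-averaged MSD of a 2D track (list of (x, y) in metres) at an integer frame lag."""
    n = len(track) - lag
    total = 0.0
    for i in range(n):
        dx = track[i + lag][0] - track[i][0]
        dy = track[i + lag][1] - track[i][1]
        total += dx * dx + dy * dy
    return total / n


def diffusion_coefficient(track, dt, max_lag=4):
    """Least-squares fit of MSD(tau) = 4 * D * tau through the origin over the first lags."""
    taus = [lag * dt for lag in range(1, max_lag + 1)]
    msds = [mean_square_displacement(track, lag) for lag in range(1, max_lag + 1)]
    slope = sum(t * m for t, m in zip(taus, msds)) / sum(t * t for t in taus)
    return slope / 4.0


def stokes_einstein_radius(d, temperature, viscosity):
    """Hydrodynamic radius in metres from the Stokes-Einstein equation."""
    return K_B * temperature / (6.0 * math.pi * viscosity * d)


def simulate_track(d, dt, steps, seed=0):
    """Simulate free 2D Brownian motion (an assumption standing in for real video data)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * d * dt)  # per-axis step standard deviation
    x = y = 0.0
    track = [(x, y)]
    for _ in range(steps):
        x += rng.gauss(0.0, sigma)
        y += rng.gauss(0.0, sigma)
        track.append((x, y))
    return track


TEMPERATURE = 298.0   # K, assumed
VISCOSITY = 8.9e-4    # Pa*s, approximate value for water at 25 degrees C
true_radius = 50e-9   # m, hypothetical bead radius
true_d = K_B * TEMPERATURE / (6.0 * math.pi * VISCOSITY * true_radius)

track = simulate_track(true_d, dt=0.01, steps=5000)
estimated_d = diffusion_coefficient(track, dt=0.01)
estimated_radius = stokes_einstein_radius(estimated_d, TEMPERATURE, VISCOSITY)
```

Fitting only the first few lags is a common choice because MSD estimates at long lags average over fewer independent displacements and are therefore noisier.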

Relevance: 20.00%

Abstract:

The aim of this study was to evaluate in vivo caries detection using the ICDAS visual examination, fiber-optic transillumination combined with ICDAS, and radiographic examination. A total of 2,279 proximal surfaces and pits and fissures in permanent upper incisors, premolars, and molars, and 272 surfaces in primary molars, in 72 schoolchildren (8 to 18 years old) were assessed by one trained examiner. The seven scores for primary caries detection of the ICDAS visual system were applied. Two fiber-optic transillumination devices were evaluated: FOTI Schott (SCH), with a fiber-optic tip 0.5 mm in diameter, and FOTI Microlux (MIC), with a tip 3 mm in diameter. During the combined FOTI/ICDAS examination, the optical fiber was used both to illuminate and to transilluminate the surface under evaluation. The radiographic examination (RX) consisted of posterior bitewing and anterior periapical radiographs. The examinations were performed in a dental office after supervised toothbrushing. On the first examination day, the visual examination using ICDAS was performed, followed by the examination combined with MIC or SCH. The radiographic examination was performed immediately afterward. One week later, ICDAS was performed again, followed by the combined examination with the FOTI device not used the previous week. The examinations were repeated in 10 patients after a minimum interval of one week to assess intra-examiner reproducibility, which yielded weighted kappa values of 0.95 (ICDAS), 0.94 (MIC), 0.95 (SCH), and 0.99 (RX). In pits and fissures of permanent teeth, RX judged more surfaces to have dentin lesions (53) than the other methods (34 to 36); however, it detected no enamel lesions, which were identified by ICDAS (94), SCH (107), and MIC (91).
On proximal surfaces of permanent teeth, fiber-optic transillumination identified more surfaces as enamel lesions, 150 (SCH) and 139 (MIC), than the visual examination (106), whereas RX identified only 43. On occlusal surfaces of primary teeth, the four methods judged an approximately similar number of surfaces to be lesion-free (52 to 59) or to have dentin lesions (21 to 26), as was the case for proximal dentin lesions (31 to 36). However, the radiographic examination judged fewer proximal enamel lesions in primary teeth (3) than the other methods (15 to 16). In primary teeth, ICDAS and FOTI combined with the visual examination judged more proximal enamel lesions than the radiographic examination, while a similar number of dentin lesions were classified by the four methods on occlusal and proximal surfaces of primary molars. In pits and fissures of permanent teeth, both the ICDAS visual examination and its combination with the two transillumination devices showed greater similarity in the surfaces judged as enamel lesions or as dentin lesions, whereas the radiographic examination classified more surfaces as dentin lesions and none as enamel lesions. Adding fiber-optic transillumination to the visual examination increased by one third the detection of proximal carious lesions judged to be in dentin by ICDAS alone, and approximately quadrupled the number of those so classified by radiographic assessment in permanent teeth.

Relevance: 20.00%

Abstract:

A visual pattern recognition network and its training algorithm are proposed. The network is constructed of a one-layer morphology network and a two-layer modified Hamming net. This visual network can implement invariant pattern recognition with respect to image translation and size projection. After supervised learning takes place, the visual network extracts image features and classifies patterns much the same as living beings do. Moreover, we set up its optoelectronic architecture for real-time pattern recognition. (C) 1996 Optical Society of America