995 results for Visual input
                                
Abstract:
We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.
                                
Abstract:
Stimuli outside classical receptive fields have been shown to exert significant influence over the activities of neurons in primary visual cortex. We propose that contextual influences are used for pre-attentive visual segmentation, in a new framework called segmentation without classification. This means that segmentation of an image into regions occurs without classification of features within a region or comparison of features between regions. This segmentation framework is simpler than previous computational approaches, making it implementable by V1 mechanisms, though higher level visual mechanisms are needed to refine its output. However, it easily handles a class of segmentation problems that are tricky in conventional methods. The cortex computes global region boundaries by detecting the breakdown of homogeneity or translation invariance in the input, using local intra-cortical interactions mediated by the horizontal connections. The difference between contextual influences near and far from region boundaries makes neural activities near region boundaries higher than elsewhere, making boundaries more salient for perceptual pop-out. This proposal is implemented in a biologically based model of V1, and demonstrated using examples of texture segmentation and figure-ground segregation. The model performs segmentation in exactly the same neural circuit that solves the dual problem of the enhancement of contours, as is suggested by experimental observations. Its behavior is compared with psychophysical and physiological data on segmentation, contour enhancement, and contextual influences. We discuss the implications of segmentation without classification and the predictions of our V1 model, and relate it to other phenomena such as asymmetry in visual search.
                                
Abstract:
Stimuli outside classical receptive fields significantly influence the neurons' activities in primary visual cortex. We propose that such contextual influences are used to segment regions by detecting the breakdown of homogeneity or translation invariance in the input, thus computing global region boundaries using local interactions. This is implemented in a biologically based model of V1, and demonstrated in examples of texture segmentation and figure-ground segregation. By contrast with traditional approaches, segmentation occurs without classification or comparison of features within or between regions and is performed by exactly the same neural circuit responsible for the dual problem of the grouping and enhancement of contours.
                                
Abstract:
We present MikeTalk, a text-to-audiovisual speech synthesizer which converts input text into an audiovisual speech stream. MikeTalk is built using visemes, which are a small set of images spanning a large range of mouth shapes. The visemes are acquired from a recorded visual corpus of a human subject which is specifically designed to elicit one instantiation of each viseme. Using optical flow methods, correspondence from every viseme to every other viseme is computed automatically. By morphing along this correspondence, a smooth transition between viseme images may be generated. A complete visual utterance is constructed by concatenating viseme transitions. Finally, phoneme and timing information extracted from a text-to-speech synthesizer is exploited to determine which viseme transitions to use, and the rate at which the morphing process should occur. In this manner, we are able to synchronize the visual speech stream with the audio speech stream, and hence give the impression of a photorealistic talking face.
                                
Abstract:
Master's thesis (Universidad Francisco de Nebrija, 2013)
                                
Abstract:
Visual exploration of scientific data in the life sciences is a growing research field due to the large amount of available data. Kohonen's Self-Organizing Map (SOM) is a widely used tool for visualization of multidimensional data. In this paper we present a fast learning algorithm for SOMs that uses a simulated annealing method to adapt the learning parameters. The algorithm has been adopted in a data analysis framework for the generation of similarity maps. Such maps provide an effective tool for the visual exploration of large and multi-dimensional input spaces. The approach has been applied to data generated during the High Throughput Screening of molecular compounds; the generated maps allow a visual exploration of molecules with similar topological properties. The experimental analysis on real-world data from the National Cancer Institute shows the speed-up of the proposed SOM training process in comparison to a traditional approach. The resulting visual landscape groups molecules with similar chemical properties in densely connected regions.
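As a rough illustration of the kind of SOM training loop this abstract refers to, the sketch below trains a small one-dimensional map in pure Python. The exponential decay of the learning rate and neighbourhood radius is a common textbook schedule used here as a stand-in; it is not the simulated-annealing adaptation the authors propose, and all function names and parameter values are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weights are closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))


def train_som(data, n_units=5, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a 1-D self-organizing map on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each unit's weight vector from a randomly chosen sample.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        # Textbook exponential decay (an assumption, not the paper's schedule).
        frac = epoch / epochs
        lr = lr0 * math.exp(-3.0 * frac)
        radius = max(radius0 * math.exp(-3.0 * frac), 0.5)
        # Present the samples in a fresh random order each epoch.
        for x in rng.sample(data, len(data)):
            bmu = best_matching_unit(weights, x)
            for j in range(n_units):
                # Gaussian neighbourhood on the 1-D grid of units.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    weights[j][k] += lr * h * (x[k] - weights[j][k])
    return weights
```

Because every update moves a weight part of the way toward a sample, the trained weights stay inside the bounding box of the data; a two-dimensional grid of units, as used for similarity maps, would follow the same pattern with a 2-D neighbourhood distance.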
                                
Abstract:
In immediate recall tasks, visual recency is substantially enhanced when output interference is low (Cowan, Saults, Elliott, & Moreno, 2002; Craik, 1969) whereas auditory recency remains high even under conditions of high output interference. This auditory advantage has been interpreted in terms of auditory resistance to output interference (e.g., Neath & Surprenant, 2003). In this study the auditory-visual difference at low output interference re-emerged when ceiling effects were accounted for, but only with spoken output. With written responding the auditory advantage remained significantly larger with high than with low output interference. These new data suggest that both superior auditory encoding and modality-specific output interference contribute to the classic auditory-visual modality effect.
                                
Abstract:
Rats with fornix transection, or with cytotoxic retrohippocampal lesions that removed entorhinal cortex plus ventral subiculum, performed a task that permits incidental learning about either allocentric (Allo) or egocentric (Ego) spatial cues without the need to navigate by them. Rats learned eight visual discriminations among computer-displayed scenes in a Y-maze, using the constant-negative paradigm. Every discrimination problem included two familiar scenes (constants) and many less familiar scenes (variables). On each trial, the rats chose between a constant and a variable scene, with the choice of the variable rewarded. In six problems, the two constant scenes had correlated spatial properties, either Allo (each constant appeared always in the same maze arm) or Ego (each constant always appeared in a fixed direction from the start arm) or both (Allo + Ego). In two No-Cue (NC) problems, the two constants appeared in randomly determined arms and directions. Intact rats learn problems with an added Allo or Ego cue faster than NC problems; this facilitation provides indirect evidence that they learn the associations between scenes and spatial cues, even though that is not required for problem solution. Fornix and retrohippocampal-lesioned groups learned NC problems at a similar rate to sham-operated controls and showed as much facilitation of learning by added spatial cues as did the controls; therefore, both lesion groups must have encoded the spatial cues and have incidentally learned their associations with particular constant scenes. Similar facilitation was seen in subgroups that had short or long prior experience with the apparatus and task. Therefore, neither major hippocampal input-output system is crucial for learning about allocentric or egocentric cues in this paradigm, which does not require rats to control their choices or navigation directly by spatial cues.
                                
Abstract:
Spontaneous activity of the brain at rest frequently has been considered a mere backdrop to the salient activity evoked by external stimuli or tasks. However, the resting state of the brain consumes most of its energy budget, which suggests a far more important role. An intriguing hint comes from experimental observations of spontaneous activity patterns, which closely resemble those evoked by visual stimulation with oriented gratings, except that cortex appeared to cycle between different orientation maps. Moreover, patterns similar to those evoked by the behaviorally most relevant horizontal and vertical orientations occurred more often than those corresponding to oblique angles. We hypothesize that this kind of spontaneous activity develops at least to some degree autonomously, providing a dynamical reservoir of cortical states, which are then associated with visual stimuli through learning. To test this hypothesis, we use a biologically inspired neural mass model to simulate a patch of cat visual cortex. Spontaneous transitions between orientation states were induced by modest modifications of the neural connectivity, establishing a stable heteroclinic channel. Significantly, the experimentally observed greater frequency of states representing the behaviorally important horizontal and vertical orientations emerged spontaneously from these simulations. We then applied bar-shaped inputs to the model cortex and used Hebbian learning rules to modify the corresponding synaptic strengths. After unsupervised learning, different bar inputs reliably and exclusively evoked their associated orientation state; whereas in the absence of input, the model cortex resumed its spontaneous cycling. We conclude that the experimentally observed similarities between spontaneous and evoked activity in visual cortex can be explained as the outcome of a learning process that associates external stimuli with a preexisting reservoir of autonomous neural activity states. 
Our findings hence demonstrate how cortical connectivity can link the maintenance of spontaneous activity in the brain mechanistically to its core cognitive functions.
                                
Abstract:
Traditional retinal projections target three functionally complementary systems in the brain of mammals: the primary visual system, the visuomotor integration systems and the circadian timing system. In recent years, studies in several animals have been conducted to investigate the retinal projections to these three systems, despite some evidence of additional targets. The aim of this study was to disclose a previously unknown connection between the retina and the parabrachial complex of the common marmoset, by means of the intraocular injection of cholera toxin subunit B. A few labeled retinal fibers/terminals detected in the medial parabrachial portion of the marmoset brain show clear varicosities, suggesting terminal fields. Although the possible role of these projections remains unknown, they may provide a modulation of the cholinergic parabrachial neurons which project to the thalamic dorsal lateral geniculate nucleus. (c) 2008 Elsevier Ireland Ltd. All rights reserved.
                                
Abstract:
Biological systems readily capture the salient object(s) in a given scene, but this remains a difficult task for artificial vision systems. In this paper, a visual selection mechanism based on an integrate-and-fire neural network is proposed. The model can not only discriminate objects in a given visual scene but also direct the focus of attention to the salient object. Moreover, it processes a combination of relevant features of an input scene, such as intensity, color, orientation, and their contrast. In comparison to other visual selection approaches, this model presents several interesting features. It is able to direct attention to objects with complex shapes, including those that are not linearly separable. Moreover, computer simulations show that the model produces results similar to those observed in natural vision systems.
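To make the integrate-and-fire building block concrete, here is a minimal sketch of a single leaky integrate-and-fire unit driven by a sequence of input values. The leak term, the Euler step and every parameter value are assumptions chosen for illustration; the abstract does not specify the neuron model's details, and the network coupling that produces object selection is not shown.

```python
def simulate_lif(current, dt=0.1, tau=10.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Euler-integrate a leaky integrate-and-fire unit.

    `current` is a sequence of input values, one per time step.
    Returns the list of step indices at which the unit fired.
    """
    v = v_rest
    spikes = []
    for step, i_in in enumerate(current):
        # Membrane potential leaks toward v_rest and is driven by the input.
        v += dt * (-(v - v_rest) + i_in) / tau
        if v >= v_thresh:
            spikes.append(step)
            v = v_reset  # reset after a spike
    return spikes
```

With these hypothetical parameters a constant input below the threshold never produces a spike, while stronger inputs fire at higher rates, which is the basic property a saliency mechanism can exploit when input strength reflects feature contrast.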
                                
Abstract:
Since the pioneering discoveries of Hubel and Wiesel, a vast literature has accumulated describing the neuronal responses of primary visual cortex (V1) to different visual stimuli. These stimuli consist mainly of moving bars, dots, or gratings, which are useful for probing responses within the classical receptive field (CRF) to basic features of visual stimuli such as orientation, direction of motion, and contrast, among others. However, over the past two decades it has become increasingly evident that the activity of V1 neurons can be modulated by stimuli outside the CRF. Primary visual areas could therefore be involved in more complex visual functions, such as separating an object or figure from its background (figure-ground segregation), and it is assumed that long-range intrinsic connections in V1, as well as connections from higher visual areas, are actively involved in this process. Their possible function has been inferred from analysis of the response variations induced by a stimulus located outside the CRF of individual neurons. Although it is very likely that these connections also affect both the joint activity of neurons involved in processing the figure and the field potential, these questions remain little studied. To examine the modulation of these activities by visual context, we recorded action potentials and field potentials in parallel from up to 48 electrodes implanted in the primary visual area of anesthetized cats. We stimulated with composite gratings and natural scenes, focusing on the activity of neurons whose CRF was located on the figure. Likewise, to examine the influence of lateral connections, the signal from the isotopic, contralateral visual area was removed by reversible cooling deactivation.
We did this because: (i) intrinsic lateral connections cannot be easily manipulated without directly affecting the signals being measured; (ii) interhemispheric connections share the main anatomical features of the intrinsic lateral network and can be seen as its functional continuation between the two hemispheres; and (iii) cooling deactivates the connections causally and reversibly, temporarily silencing their signal and allowing direct conclusions about their contribution. Our results demonstrate that the figure-ground segmentation mechanism is reflected in the firing rates of individual neurons, as well as in the power of the field potential and in the relationship between its phase and the firing patterns produced by the population. Furthermore, the interhemispheric lateral connections modulate these variables depending on the stimulation delivered outside the CRF. We also observed an influence of this lateral circuit on the coherence between field potentials recorded at distant electrodes. In conclusion, our results support the idea of a complex figure-ground segmentation mechanism operating from the primary visual areas onward across different frequency scales. This mechanism appears to involve groups of neurons that are synchronously active and dependent on the phase of the field potential. Our results are also compatible with the hypothesis that long-range lateral connections are likewise part of this mechanism.
                                
Abstract:
The primary and accessory optic systems comprise two sets of retinorecipient neural clusters. In this study, these visually related centers in the rock cavy were evaluated using the retinal innervation pattern and Nissl-stained cytoarchitecture, after unilateral intraocular injection of cholera toxin B subunit and immunohistochemical reaction of coronal and sagittal sections from the diencephalon and midbrain region. Three subcortical centres of the primary visual system were identified: the superior colliculus, lateral geniculate complex and pretectal complex. The lateral geniculate complex is formed by a series of nuclei receiving direct visual information from the retina: the dorsal lateral geniculate nucleus, intergeniculate leaflet and ventral lateral geniculate nucleus. The pretectal complex is formed by a series of pretectal nuclei: the medial pretectal nucleus, olivary pretectal nucleus, posterior pretectal nucleus, nucleus of the optic tract and anterior pretectal nucleus. In the accessory optic system, retinal terminals were observed in the dorsal terminal, lateral terminal and medial terminal nuclei as well as in the interstitial nucleus of the superior fasciculus, posterior fibres. All retinorecipient nuclei received bilateral input, with a contralateral predominance. This is the first study of this nature in the rock cavy and the results are compared with the data obtained for other species. The investigation represents a contribution to the knowledge regarding the organization of visual optic systems in relation to the biology of species.
                                
Abstract:
A CMOS/SOI circuit to decode Pulse-Width Modulation (PWM) signals is presented as part of a body-implanted neurostimulator for visual prosthesis. Since encoded data is the sole input to the circuit, the decoding technique is based on a novel double-integration concept and does not require low-pass filtering. Non-overlapping control phases are internally derived from the incoming pulses and a fast-settling comparator ensures good discrimination accuracy in the megahertz range. The circuit was integrated on a 2 µm single-metal thin-film CMOS/SOI fabrication process and has an effective area of 2 mm². Measured resolution of encoding parameter α is better than 10% at 6 MHz and V_DD = 3.3 V. Idle-mode consumption is 340 µW. Pulses of frequencies up to 15 MHz and α = 10% can be discriminated for 2.3 V ≤ V_DD ≤ 3.3 V. Such an excellent immunity to V_DD deviations meets a design specification with respect to inherent coupling losses on transmitting data and power by means of a transcutaneous link.
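The abstract does not describe the double-integration circuit in detail. As a loose software analogy only, the sketch below decodes a sampled PWM waveform by accumulating two running integrals per period, one over the high phase and one over the low phase, and comparing them at the next rising edge. The function name, the sampling model, the threshold and the bit convention (high phase longer than low phase means 1) are all assumptions, not the authors' design.

```python
def decode_pwm(samples, threshold=0.5):
    """Return one bit per complete PWM period found in a sampled waveform."""
    bits = []
    high_area = low_area = 0   # running integrals for the current period
    prev_level = 0
    started = False            # becomes True at the first rising edge
    for sample in samples:
        level = 1 if sample > threshold else 0
        if level == 1 and prev_level == 0:
            # A rising edge closes the previous period (if any) and opens a new one.
            if started:
                bits.append(1 if high_area > low_area else 0)
            started = True
            high_area = low_area = 0
        if started:
            if level:
                high_area += 1
            else:
                low_area += 1
        prev_level = level
    return bits
```

Note that a period is only emitted once the following rising edge arrives, so the final period of a waveform needs a trailing edge to be decoded. No averaging filter is involved, which is the one property this analogy shares with the description above.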
                                
Abstract:
Visual perception and action are strongly linked with parallel processing channels connecting the retina, the lateral geniculate nucleus, and the input layers of the primary visual cortex. Achromatic vision is provided by at least two such channels, formed by the M and P neurons. These cell pathways are similarly organized in primates having different lifestyles, including diurnal and nocturnal species and species that exhibit a variety of color vision phenotypes. We describe the M and P cell properties by 3D Gábor functions and their 3D Fourier transform. The M and P cells occupy different loci in the Gábor information diagram or Fourier space. This separation allows the M and P pathways to transmit visual signals with distinct 6D joint entropy for space, spatial frequency, time, and temporal frequency. By combining the M and P impacts on the cortical neurons beyond V1 input layers, the cortical pathways are able to process aspects of visual stimuli with better precision than would be possible using the M or P pathway alone. This performance fulfils the requirements of different behavioral tasks.
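The relationship between a Gábor function and its Fourier transform can be illustrated in one dimension: a Gaussian envelope multiplying a cosine carrier has a spectrum concentrated around the carrier frequency. The sketch below is a simplified 1-D stand-in for the 3D functions mentioned above; the parameter values and function names are hypothetical, and a plain discrete Fourier sum is used so that no external library is needed.

```python
import cmath
import math


def gabor(x, sigma, f0):
    """1-D Gabor function: Gaussian envelope times a cosine carrier."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) * math.cos(2.0 * math.pi * f0 * x)


def peak_frequency(sigma=1.0, f0=2.0, n=256, span=16.0):
    """Sample a Gabor function on [-span/2, span/2) and locate its spectral peak."""
    dx = span / n
    g = [gabor(-span / 2.0 + k * dx, sigma, f0) for k in range(n)]
    best_bin, best_mag = 0, -1.0
    # Scan the non-negative frequency bins of the discrete Fourier transform.
    for m in range(n // 2 + 1):
        coeff = sum(g[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        if abs(coeff) > best_mag:
            best_bin, best_mag = m, abs(coeff)
    # Bin m corresponds to frequency m / span (cycles per unit of x).
    return best_bin / span
```

Widening the envelope (larger `sigma`) narrows the spectral lobe and vice versa, which is the space versus spatial-frequency trade-off that the Gábor information diagram represents.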
 
                    