866 results for Camera vision system
Abstract:
A fundamental problem for any visual system with binocular overlap is the combination of information from the two eyes. Electrophysiology shows that binocular integration of luminance contrast occurs early in visual cortex, but a specific systems architecture has not been established for human vision. Here, we address this by performing binocular summation and monocular, binocular, and dichoptic masking experiments for horizontal 1 cycle per degree test and masking gratings. These data reject three previously published proposals, each of which predicts too little binocular summation and insufficient dichoptic facilitation. However, a simple development of one of the rejected models (the twin summation model) and a completely new model (the two-stage model) provide very good fits to the data. Two features common to both models are gently accelerating (almost linear) contrast transduction prior to binocular summation and suppressive ocular interactions that contribute to contrast gain control. With all model parameters fixed, both models correctly predict (1) systematic variation in psychometric slopes, (2) dichoptic contrast matching, and (3) high levels of binocular summation for various levels of binocular pedestal contrast. A review of evidence from elsewhere leads us to favor the two-stage model. © 2006 ARVO.
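The two features this abstract attributes to both successful models (near-linear contrast transduction before binocular summation, and suppressive interactions between the eyes) can be illustrated with a minimal sketch. The functional form, the function names, and every parameter value below are assumptions chosen for illustration, not the fitted model from the paper.

```python
def monocular_stage(c_this, c_other, m=1.3, s=1.0):
    """Stage 1 for one eye (illustrative form).

    c_this and c_other are contrasts (in percent) in this eye and the other eye.
    The exponent m is slightly above 1, giving gently accelerating transduction.
    The other eye's contrast appears in the denominator, so it suppresses
    this eye's signal (interocular gain control).
    """
    return c_this ** m / (s + c_this + c_other)


def two_stage_response(c_left, c_right, p=8.0, q=6.5, z=0.08):
    """Sum the two monocular signals, then apply a second nonlinearity.

    p, q and z are hypothetical values for the second gain-control stage.
    """
    b = monocular_stage(c_left, c_right) + monocular_stage(c_right, c_left)
    return b ** p / (z + b ** q)


if __name__ == "__main__":
    # Compare one-eye and two-eye presentation of the same 1% contrast.
    print("monocular:", two_stage_response(1.0, 0.0))
    print("binocular:", two_stage_response(1.0, 1.0))
```

With these illustrative values the binocular response exceeds the monocular one, and adding contrast to the other eye lowers the stage-1 signal of the tested eye, which is the qualitative pattern the abstract describes.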
Abstract:
The human visual system combines contrast information from the two eyes to produce a single cyclopean representation of the external world. This task requires both summation of congruent images and inhibition of incongruent images across the eyes. These processes were explored psychophysically using narrowband sinusoidal grating stimuli. Initial experiments focussed on binocular interactions within a single detecting mechanism, using contrast discrimination and contrast matching tasks. Consistent with previous findings, dichoptic presentation produced greater masking than monocular or binocular presentation. Four computational models were compared, two of which performed well on all data sets. Suppression between mechanisms was then investigated, using orthogonal and oblique stimuli. Two distinct suppressive pathways were identified, corresponding to monocular and dichoptic presentation. Both pathways impact prior to binocular summation of signals, and differ in their strengths, tuning, and response to adaptation, consistent with recent single-cell findings in cat. Strikingly, the magnitude of dichoptic masking was found to be spatiotemporally scale invariant, whereas monocular masking was dependent on stimulus speed. Interocular suppression was further explored using a novel manipulation, whereby stimuli were presented in dichoptic antiphase. Consistent with the predictions of a computational model, this produced weaker masking than in-phase presentation. This allowed the bandwidths of suppression to be measured without the complicating factor of additive combination of mask and test. Finally, contrast vision in strabismic amblyopia was investigated. Although amblyopes are generally believed to have impaired binocular vision, binocular summation was shown to be intact when stimuli were normalized for interocular sensitivity differences. 
An alternative account of amblyopia was developed, in which signals in the affected eye are subject to attenuation and additive noise prior to binocular combination.
Abstract:
Over the last ten years our understanding of early spatial vision has improved enormously. The long-standing model of probability summation amongst multiple independent mechanisms with static output nonlinearities responsible for masking is obsolete. It has been replaced by a much more complex network of additive, suppressive, and facilitatory interactions and nonlinearities across eyes, area, spatial frequency, and orientation that extend well beyond the classical receptive field (CRF). A review of a substantial body of psychophysical work performed by ourselves (20 papers), and others, leads us to the following tentative account of the processing path for signal contrast. The first suppression stage is monocular, isotropic, non-adaptable, accelerates with RMS contrast, most potent for low spatial and high temporal frequencies, and extends slightly beyond the CRF. Second and third stages of suppression are difficult to disentangle but are possibly pre- and post-binocular summation, and involve components that are scale invariant, isotropic, anisotropic, chromatic, achromatic, adaptable, interocular, substantially larger than the CRF, and saturated by contrast. The monocular excitatory pathways begin with half-wave rectification, followed by a preliminary stage of half-binocular summation, a square-law transducer, full binocular summation, pooling over phase, cross-mechanism facilitatory interactions, additive noise, linear summation over area, and a slightly uncertain decision-maker. The purpose of each of these interactions is far from clear, but the system benefits from area and binocular summation of weak contrast signals as well as area and ocularity invariances above threshold (a herd of zebras doesn't change its contrast when it increases in number or when you close one eye). One of many remaining challenges is to determine the stage or stages of spatial tuning in the excitatory pathway.
Abstract:
To make vision possible, the visual nervous system must represent the most informative features in the light pattern captured by the eye. Here we use Gaussian scale-space theory to derive a multiscale model for edge analysis and we test it in perceptual experiments. At all scales there are two stages of spatial filtering. An odd-symmetric, Gaussian first derivative filter provides the input to a Gaussian second derivative filter. Crucially, the output at each stage is half-wave rectified before feeding forward to the next. This creates nonlinear channels selectively responsive to one edge polarity while suppressing spurious or "phantom" edges. The two stages have properties analogous to simple and complex cells in the visual cortex. Edges are found as peaks in a scale-space response map that is the output of the second stage. The position and scale of the peak response identify the location and blur of the edge. The model predicts remarkably accurately our results on human perception of edge location and blur for a wide range of luminance profiles, including the surprising finding that blurred edges look sharper when their length is made shorter. The model enhances our understanding of early vision by integrating computational, physiological, and psychophysical approaches. © ARVO.
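The two-stage filtering scheme in this abstract can be sketched in one dimension as follows. This is a simplified illustration, not the published model: using the same scale for both stages, flipping the sign of the second-derivative output, normalising by sigma squared, and all function names are assumptions made here so that the scale-space peak lands near the blur of a Gaussian edge.

```python
import math


def gaussian_derivative_kernel(sigma, order):
    """Sampled first (order=1) or second (order=2) derivative of a unit-area Gaussian."""
    radius = int(math.ceil(5 * sigma))
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-x * x / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
        if order == 1:
            kernel.append(-x / sigma ** 2 * g)
        else:
            kernel.append((x * x / sigma ** 4 - 1.0 / sigma ** 2) * g)
    return kernel


def convolve_same(signal, kernel):
    """Discrete convolution with the signal clamped (repeated) at its borders."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i - (k - radius), 0), last)
            acc += weight * signal[j]
        out.append(acc)
    return out


def rectify(values):
    """Half-wave rectification: negative responses are set to zero."""
    return [max(v, 0.0) for v in values]


def find_edge(signal, scales, polarity=1):
    """Return (position, scale, response) of the peak in the scale-space map.

    polarity=1 selects the channel for rising edges, polarity=-1 for falling edges.
    """
    best = (None, None, 0.0)
    for sigma in scales:
        first = convolve_same(signal, gaussian_derivative_kernel(sigma, 1))
        stage1 = rectify([polarity * v for v in first])
        second = convolve_same(stage1, gaussian_derivative_kernel(sigma, 2))
        # Sign flip: a peak in stage 1 has a negative second derivative.
        # The sigma**2 factor is an assumed scale normalisation.
        stage2 = rectify([-(sigma ** 2) * v for v in second])
        position = max(range(len(stage2)), key=stage2.__getitem__)
        if stage2[position] > best[2]:
            best = (position, sigma, stage2[position])
    return best


def gaussian_edge(length, centre, blur):
    """Rising luminance edge with a Gaussian-integral profile."""
    return [0.5 * (1.0 + math.erf((x - centre) / (blur * math.sqrt(2.0))))
            for x in range(length)]


if __name__ == "__main__":
    edge = gaussian_edge(201, 100, 4.0)
    print(find_edge(edge, [2.0, 3.0, 4.0, 5.0, 6.0, 8.0]))
```

Running the rising-edge channel on a rising edge gives a clear peak at the edge position, while the opposite-polarity channel stays essentially silent because rectification after the first stage removes its input.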
Abstract:
Marr's work offered guidelines on how to investigate vision (the theory - algorithm - implementation distinction), as well as specific proposals on how vision is done. Many of the latter have inevitably been superseded, but the approach was inspirational and remains so. Marr saw the computational study of vision as tightly linked to psychophysics and neurophysiology, but the last twenty years have seen some weakening of that integration. Because feature detection is a key stage in early human vision, we have returned to basic questions about representation of edges at coarse and fine scales. We describe an explicit model in the spirit of the primal sketch, but tightly constrained by psychophysical data. Results from two tasks (location-marking and blur-matching) point strongly to the central role played by second-derivative operators, as proposed by Marr and Hildreth. Edge location and blur are evaluated by finding the location and scale of the Gaussian-derivative `template' that best matches the second-derivative profile (`signature') of the edge. The system is scale-invariant, and accurately predicts blur-matching data for a wide variety of 1-D and 2-D images. By finding the best-fitting scale, it implements a form of local scale selection and circumvents the knotty problem of integrating filter outputs across scales. [Supported by BBSRC and the Wellcome Trust]
Abstract:
The human visual system is sensitive to second-order modulations of the local contrast (CM) or amplitude (AM) of a carrier signal. Second-order cues are detected independently of first-order luminance (LM) signals; however, it is not clear why vision should benefit from second-order sensitivity. Analysis of the first- and second-order contents of natural images suggests that these cues tend to occur together, but their phase relationship varies. We have shown that in-phase combinations of LM and AM are perceived as a shaded corrugated surface whereas the anti-phase combination can be seen as corrugated when presented alone or as a flat material change when presented in a plaid containing the in-phase cue. We now extend these findings using new stimulus types and a novel haptic matching task. We also introduce a computational model based on initially separate first- and second-order channels that are combined within orientation and subsequently across orientation to produce a shading signal. Contrast gain control allows the LM + AM cue to suppress responses to the LM - AM cue when presented in a plaid. Thus, the model sees LM - AM as flat in these circumstances. We conclude that second-order vision plays a key role in disambiguating the origin of luminance changes within an image. © ARVO.
Abstract:
People readily perceive smooth luminance variations as being due to the shading produced by undulations of a 3-D surface (shape-from-shading). In doing so, the visual system must simultaneously estimate the shape of the surface and the nature of the illumination. Remarkably, shape-from-shading operates even when both these properties are unknown and neither can be estimated directly from the image. In such circumstances humans are thought to adopt a default illumination model. A widely held view is that the default illuminant is a point source located above the observer's head. However, some have argued instead that the default illuminant is a diffuse source. We now present evidence that humans may adopt a flexible illumination model that includes both diffuse and point source elements. Our model estimates a direction for the point source and then weights the contribution of this source according to a bias function. For most people the preferred illuminant direction is overhead with a strong diffuse component.
Functional neuroimaging and behavioural studies on global form processing in the human visual system
Abstract:
Magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI) and behavioural experiments were used to investigate the neural processes underlying global form perception in human vision. Behavioural studies using Glass patterns examined sensitivity for detecting radial, rotational and horizontal structure. Neuroimaging experiments using either Glass patterns or arrays of Gabor patches determined the spatio-temporal neural response to global form. MEG data were analysed using synthetic aperture magnetometry (SAM) to spatially map event-related cortical oscillatory power changes: the temporal sequencing of activity within a discrete cortical area was determined using a Morlet wavelet transform. A case study was conducted to determine the effects of strabismic amblyopia on global form processing: all other observers were normally-sighted. The main findings from normally-sighted observers were: 1) sensitivity to horizontal structure was less than for radial or rotational structure; 2) the neural response to global structure was a reduction in cortical oscillatory power (10-30 Hz) within a network of extrastriate areas, including V4 and V3a; 3) the extent of reduced cortical power was least for horizontal patterns; 4) V1 was not identified as a region of peak activity with either MEG or fMRI. The main findings with the strabismic amblyope were: 1) sensitivity for detection of radial, rotational, and horizontal structure was reduced when viewed with the amblyopic eye relative to the fellow eye; 2) cortical power changes within V4 to the presentation of rotational Glass patterns were less when viewed with the amblyopic eye compared with the fellow eye. 
The main conclusions are: 1) a network of extrastriate cortical areas is involved in the analysis of global form, with the most prominent change in neural activity being a reduction in oscillatory power within the 10-30 Hz band; 2) in strabismic amblyopia, the neuronal assembly associated with form perception in extrastriate cortex may be dysfunctional, and the nature of this dysfunction may be a change in the normal temporal pattern of neuronal discharges; 3) MEG, fMRI and behavioural measures support the notion that different neural processes underlie the perception of horizontal as opposed to radial or rotational structure.
Abstract:
This thesis presents a study of how edges are detected and encoded by the human visual system. The study begins with theoretical work on the development of a model of edge processing, and includes psychophysical experiments on humans, and computer simulations of these experiments, using the model. The first chapter reviews the literature on edge processing in biological and machine vision, and introduces the mathematical foundations of this area of research. The second chapter gives a formal presentation of a model of edge perception that detects edges and characterizes their blur, contrast and orientation, using Gaussian derivative templates. This model has previously been shown to accurately predict human performance in blur matching tasks with several different types of edge profile. The model provides veridical estimates of the blur and contrast of edges that have a Gaussian integral profile. Since blur and contrast are independent parameters of Gaussian edges, the model predicts that varying one parameter should not affect perception of the other. Psychophysical experiments showed that this prediction is incorrect: reducing the contrast makes an edge look sharper; increasing the blur reduces the perceived contrast. Both of these effects can be explained by introducing a smoothed threshold to one of the processing stages of the model. It is shown that, with this modification, the model can predict the perceived contrast and blur of a number of edge profiles that differ markedly from the ideal Gaussian edge profiles on which the templates are based. With only a few exceptions, the results from all the experiments on blur and contrast perception can be explained reasonably well using one set of parameters for each subject. In the few cases where the model fails, possible extensions to the model are discussed.
Abstract:
Separate physiological mechanisms which respond to spatial and temporal stimulation have been identified in the visual system. Some pathological conditions may selectively affect these mechanisms, offering a unique opportunity to investigate how psychophysical and electrophysiological tests reflect these visual processes, and thus enhance the use of the tests in clinical diagnosis. Amblyopia and optical blur were studied, representing spatial visual defects of neural and optical origin, respectively. Selective defects of the visual pathways were also studied - optic neuritis which affects the optic nerve, and dementia of the Alzheimer type in which the higher association areas are believed to be affected, but the primary projections spared. Seventy control subjects from 10 to 79 years of age were investigated. This provided material for an additional study of the effect of age on the psychophysical and electrophysiological responses. Spatial processing was measured by visual acuity, the contrast sensitivity function, or spatial modulation transfer function (MTF), and the pattern reversal and pattern onset-offset visual evoked potential (VEP). Temporal, or luminance, processing was measured by the de Lange curve, or temporal MTF, and the flash VEP. The pattern VEP was shown to reflect the integrity of the optic nerve, geniculostriate pathway and primary projections, and was related to high temporal frequency processing. The individual components of the flash VEP differed in their characteristics. The results suggested that the P2 component reflects the function of the higher association areas and is related to low temporal frequency processing, while the P1 component reflects the primary projection areas. The combination of a delayed flash P2 component and a normal latency pattern VEP appears to be specific to dementia of the Alzheimer type and represents an important diagnostic test for this condition.
Abstract:
The aim of this work was to investigate human contrast perception at various contrast levels ranging from detection threshold to suprathreshold levels by using psychophysical techniques. The work consists of two major parts. The first part deals with contrast matching, and the second part deals with contrast discrimination. Contrast matching technique was used to determine when the perceived contrasts of different stimuli were equal. The effects of spatial frequency, stimulus area, image complexity and chromatic contrast on contrast detection thresholds and matches were studied. These factors influenced detection thresholds and perceived contrast at low contrast levels. However, at suprathreshold contrast levels perceived contrast became directly proportional to the physical contrast of the stimulus and almost independent of factors affecting detection thresholds. Contrast discrimination was studied by measuring contrast increment thresholds which indicate the smallest detectable contrast difference. The effects of stimulus area, external spatial image noise and retinal illuminance were studied. The above factors affected contrast detection thresholds and increment thresholds measured at low contrast levels. At high contrast levels, contrast increment thresholds became very similar so that the effect of these factors decreased. Human contrast perception was modelled by regarding the visual system as a simple image processing system. A visual signal is first low-pass filtered by the ocular optics. This is followed by spatial high-pass filtering by the neural visual pathways, and addition of internal neural noise. Detection is mediated by a local matched filter which is a weighted replica of the stimulus whose sampling efficiency decreases with increasing stimulus area and complexity. According to the model, the signals to be compared in a contrast matching task are first transferred through the early image processing stages mentioned above. 
Then they are filtered by a restoring transfer function which compensates for the low-level filtering and limited spatial integration at high contrast levels. Perceived contrasts of the stimuli are equal when the restored responses to the stimuli are equal. According to the model, the signals to be discriminated in a contrast discrimination task first go through the early image processing stages, after which signal dependent noise is added to the matched filter responses. The decision made by the human brain is based on the comparison between the responses of the matched filters to the stimuli, and the accuracy of the decision is limited by pre- and post-filter noises. The model for human contrast perception could accurately describe the results of contrast matching and discrimination in various conditions.
Abstract:
After thirty years of vacillation, the Tanzanian government has made a firm decision to Swahilize its secondary education system. It has also embarked on an ambitious economic and social development programme (Vision 2025) to transform its peasant society into a modern agricultural community. However, there is a faction in Tanzania opposed to Kiswahili as the medium of education. Already many members of the middle and upper classes send their children to English medium primary schools to avoid the Kiswahili medium public schools and to prepare their children for the English medium secondary system presently in place. Within the education system, particularly at university level, there is a desire to maintain English as the medium of education. English is seen to provide access to the international scientific community, to cutting edge technology and to the global economy. My interest in this conflict of interests stems from several years' experience teaching English to students at Sokoine University of Agriculture. Students specialise in agriculture and are expected to work with the peasant population on graduation. The students experience difficulties studying in English and then find their Kiswahili skills insufficient to explain to farmers the new techniques and technologies that they have studied in English. They are hampered by a complex triglossic situation in which they use their mother tongue with family and friends, Kiswahili, the national language for early education and most public communication within Tanzania, and English for advanced studies. My aim in this thesis was - to study the language policy in Tanzania and see how it is understood and implemented; - to examine the attitudes towards the various languages and their various roles; - to investigate actual language behaviour in Tanzanian higher education. My conclusion is that the dysfunctionality of the present study has to be addressed. Diglossic public life in Tanzania has to be accommodated. 
The only solution appears to be a compromise, namely a bilingual education system which draws support from all classes of society by using Kiswahili, together with an early introduction of English and its promotion as a privileged foreign language, so that Tanzania can continue to develop internally through Kiswahili and at the same time retain access to the globalising world through the medium of English.
Abstract:
Multiple system atrophy (MSA) is a rare movement disorder and a member of a group of neurodegenerative diseases referred to collectively as the ‘parkinsonian syndromes’. Characteristic of these syndromes is that the patient exhibits symptoms of ‘parkinsonism’, viz., a range of problems involving movement, most typically manifest in Parkinson’s disease (PD) itself, but also seen in progressive supranuclear palsy (PSP), and to some extent in dementia with Lewy bodies (DLB). MSA is a relatively ‘new’ descriptive term and is derived from three previously described diseases, viz., olivopontocerebellar atrophy, striato-nigral degeneration, and Shy-Drager syndrome. The classical symptoms of MSA include parkinsonism, ataxia, and autonomic dysfunction. Ataxia describes a gross lack of coordination of muscle movements while autonomic dysfunction involves a variety of systems that regulate unconscious bodily functions such as heart rate, blood pressure, bladder function, and digestion. Although primarily a neurological disorder, patients with MSA may also develop visual signs and symptoms that could be useful in differential diagnosis. The most important visual signs may include oculomotor dysfunction and problems in pupil reactivity but are less likely to involve aspects of primary vision such as visual acuity, colour vision, and visual fields. In addition, the eye-care practitioner can contribute to the management of the visual problems of MSA and therefore, help to improve quality of life of the patient. Hence, this first article in a two-part series describes the general features of MSA including its prevalence, signs and symptoms, diagnosis, pathology, and possible causes.
Abstract:
This research develops a low-cost remote sensing system for use in agricultural applications. The important features of the system are that it monitors the near infrared and it incorporates position and attitude measuring equipment allowing for geo-rectified images to be produced without the use of ground control points. The equipment is designed to be hand held and hence requires no structural modification to the aircraft. The portable remote sensing system consists of an inertial measurement unit (IMU), which is accelerometer based, a low-cost GPS device and a small format false colour composite digital camera. The total cost of producing such a system is below GBP 3000, which is far cheaper than equivalent existing systems. The design of the portable remote sensing device has eliminated bore sight misalignment errors from the direct geo-referencing process. A new processing technique has been introduced for the data obtained from these low-cost devices, and it is found that using this technique the image can be matched (overlaid) onto Ordnance Survey Master Maps at an accuracy compatible with precision agriculture requirements. The direct geo-referencing has also been improved by introducing an algorithm capable of correcting oblique images directly. This algorithm alters the pixel values, hence it is advised that image analysis is performed before image georectification. The drawback of this research is that the low-cost GPS device experienced bad checksum errors, which resulted in missing data. The Wide Area Augmentation System (WAAS) correction could not be employed because the satellites could not be locked onto whilst flying. The best GPS data were obtained from the Garmin eTrex (15 m kinematic and 2 m static) instruments which have a high-sensitivity receiver with good lock-on capability. 
The limitation of this GPS device is the inability to effectively receive the P-Code wavelength, which is needed to gain the best accuracy when undertaking differential GPS processing. Pairing the carrier phase L1 with the pseudorange C/A-Code received, in order to determine the image coordinates by the differential technique, is still under investigation. To improve the position accuracy, it is recommended that a GPS base station should be established near the survey area, instead of using a permanent GPS base station established by the Ordnance Survey.
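Direct geo-referencing of the kind this abstract describes places a pixel on the ground from the camera's position and attitude alone. The sketch below illustrates the basic geometry only and is not the algorithm developed in the thesis. It assumes a pinhole camera looking along the body's down axis, flat terrain, a local north-east-down frame, the common yaw-pitch-roll (Z-Y-X) rotation order, and hypothetical function and parameter names.

```python
import math


def body_to_ned(roll, pitch, yaw):
    """Rotation matrix from the body frame (forward, right, down) to north-east-down.

    Angles are in radians and use the yaw-pitch-roll (Z-Y-X) convention.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def pixel_to_ground(north, east, height, roll, pitch, yaw,
                    u_right, v_forward, focal_px):
    """Intersect the viewing ray of one pixel with flat ground.

    north, east: camera position in metres; height: metres above the ground.
    u_right, v_forward: pixel offsets from the principal point.
    focal_px: focal length expressed in pixels.
    Returns the (north, east) ground coordinates of the pixel.
    """
    ray_body = (v_forward, u_right, focal_px)
    rotation = body_to_ned(roll, pitch, yaw)
    ray_ned = [sum(rotation[r][c] * ray_body[c] for c in range(3)) for r in range(3)]
    if ray_ned[2] <= 0.0:
        raise ValueError("ray does not point towards the ground")
    scale = height / ray_ned[2]
    return north + scale * ray_ned[0], east + scale * ray_ned[1]


if __name__ == "__main__":
    # Camera 500 m above the ground, pitched 10 degrees nose-up.
    print(pixel_to_ground(1000.0, 2000.0, 500.0,
                          0.0, math.radians(10.0), 0.0,
                          0.0, 0.0, 1000.0))
```

For a level camera the principal point maps to the spot directly below the aircraft; tilting the camera shifts that spot by the height times the tangent of the tilt angle, which is why attitude errors dominate the ground error of oblique images.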
Abstract:
This paper addresses the problem of obtaining detailed 3D reconstructions of human faces in real-time and with inexpensive hardware. We present an algorithm based on a monocular multi-spectral photometric-stereo setup. This system is known to capture highly detailed deforming 3D surfaces at high frame rates and without having to use any expensive hardware or synchronized light stage. However, the main challenge of such a setup is the calibration stage, which depends on the lighting setup and how the lights interact with the specific material being captured, in this case, human faces. For this purpose we develop a self-calibration technique where the person being captured is asked to perform a rigid motion in front of the camera, maintaining a neutral expression. Rigidity constraints are then used to compute the head's motion with a structure-from-motion algorithm. Once the motion is obtained, a multi-view stereo algorithm reconstructs a coarse 3D model of the face. This coarse model is then used to estimate the lighting parameters with a stratified approach: In the first step we use a RANSAC search to identify purely diffuse points on the face and to simultaneously estimate this diffuse reflectance model. In the second step we apply non-linear optimization to fit a non-Lambertian reflectance model to the outliers of the previous step. The calibration procedure is validated with synthetic and real data.
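Once the lighting has been calibrated, multi-spectral photometric stereo recovers one surface normal per pixel from a single colour image. The sketch below shows only that final inversion under a purely diffuse (Lambertian) assumption. The 3x3 light matrix is treated as already known, its values in the usage example are invented, and the function names are hypothetical; it does not reproduce the paper's self-calibration or its non-Lambertian fit.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m x = b with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("light directions are (nearly) coplanar")
    solution = []
    for col in range(3):
        replaced = [[b[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def shade(light_matrix, normal, albedo=1.0):
    """Lambertian forward model: one colour channel per light (row of the matrix)."""
    return [albedo * sum(light_matrix[ch][k] * normal[k] for k in range(3))
            for ch in range(3)]


def recover_normal(light_matrix, rgb):
    """Invert the forward model: returns (unit normal, albedo) for one pixel."""
    scaled = solve3(light_matrix, rgb)  # albedo times the unit normal
    albedo = math.sqrt(sum(v * v for v in scaled))
    return [v / albedo for v in scaled], albedo


if __name__ == "__main__":
    # Invented calibration: three lights spaced evenly around the camera axis.
    lights = [[0.5, 0.0, 0.866], [-0.25, 0.433, 0.866], [-0.25, -0.433, 0.866]]
    rgb = shade(lights, [0.0, 0.0, 1.0], albedo=0.8)
    print(recover_normal(lights, rgb))
```

Because each light occupies its own colour channel, the three measurements needed for the inversion arrive in one frame, which is what allows deforming surfaces to be captured at the camera's frame rate.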