935 results for Visual Object Identification Task
Abstract:
The Antarctic Pack Ice Seal (APIS) Program was initiated in 1994 to estimate the abundance of four species of Antarctic phocids: the crabeater seal Lobodon carcinophaga, Weddell seal Leptonychotes weddellii, Ross seal Ommatophoca rossii and leopard seal Hydrurga leptonyx, and to identify ecological relationships and habitat use patterns. The Atlantic sector of the Southern Ocean (the eastern sector of the Weddell Sea) was surveyed by research teams from Germany, Norway and South Africa using a range of aerial methods over five austral summers between 1996-1997 and 2000-2001. We used these observations to model densities of seals in the area, taking into account haul-out probabilities, survey-specific sighting probabilities and covariates derived from satellite-based ice concentrations and bathymetry. These models predicted the total abundance over the area bounded by the surveys (30°W and 10°E). In this sector of the coast, we estimated seal abundances of: 514 (95% CI 337-886) × 10³ crabeater seals, 60.0 (43.2-94.4) × 10³ Weddell seals and 13.2 (5.50-39.7) × 10³ leopard seals. The crabeater seal densities, approximately 14,000 seals per degree longitude, are similar to estimates obtained by surveys in the Pacific and Indian sectors by other APIS researchers. Very few Ross seals were observed (24 total), leading to a conservative estimate of 830 (119-2894) individuals over the study area. These results provide an important baseline against which to compare future changes in seal distribution and abundance.
Abstract:
Corticobasal degeneration is a rare, progressive neurodegenerative disease and a member of the 'parkinsonian' group of disorders, which also includes Parkinson's disease, progressive supranuclear palsy, dementia with Lewy bodies and multiple system atrophy. The most common initial symptom is limb clumsiness, usually affecting one side of the body, with or without accompanying rigidity or tremor. Subsequently, the disease affects gait and progresses slowly to involve the ipsilateral arm and leg. Apraxia and dementia are the most common cortical signs. Corticobasal degeneration can be difficult to distinguish from other parkinsonian syndromes, but if ocular signs and symptoms are present they may aid clinical diagnosis. Typical ocular features include increased latency of saccadic eye movements ipsilateral to the side exhibiting apraxia, impaired smooth pursuit movements and visuo-spatial dysfunction, especially involving spatial rather than object-based tasks. Less typical features include reduction in saccadic velocity, vertical gaze palsy, visual hallucinations, sleep disturbance and an impaired electroretinogram. Aspects of primary vision such as visual acuity and colour vision are usually unaffected. Management of the condition, addressing problems with walking, movement, daily tasks and speech, is an important aspect of care.
Abstract:
Visual hallucinations appear to be more prevalent in low light, and hallucinators tend to be more prone to false-positive errors in memory tasks. Here we investigated whether the richness of stimuli does indeed affect recognition differently in hallucinating and nonhallucinating participants and, if so, whether this difference extends to identifying spatial context. We compared 36 Parkinson's disease (PD) patients with visual hallucinations, 32 PD patients without hallucinations, and 36 age-matched controls on a visual memory task in which color and black-and-white pictures were presented at different locations. Participants had to recognize the pictures among distracters, along with the location of each stimulus. Findings revealed clear differences in performance between the groups. Both PD groups had impaired recognition compared with the controls, but those with hallucinations were significantly more impaired on black-and-white than on color stimuli. In addition, the group with hallucinations was significantly impaired compared with the other two groups on spatial memory. We suggest that not only do PD patients have poorer recognition of pictorial stimuli than controls, but those who present with visual hallucinations also appear to be more heavily reliant on bottom-up sensory input and impaired in spatial ability.
Abstract:
Loss of limb results in loss of function and a partial loss of freedom. A powered prosthetic device can partially assist an individual with everyday tasks and therefore return some level of independence. Powered upper limb prostheses are often controlled by the user generating surface electromyographic (SEMG) signals. The goal of this thesis is to develop a virtual environment in which a user can control a virtual hand to safely grasp representations of everyday objects using EMG signals from his/her forearm muscles, and experience visual and vibrotactile feedback relevant to the grasping force in the process. This can then be used to train potential wearers of real EMG controlled prostheses, with or without vibrotactile feedback. To test this system an experiment was designed and executed involving ten subjects, twelve objects, and three feedback conditions. The tested feedback conditions were visual, vibrotactile, and both visual and vibrotactile. In each experimental exercise the subject attempted to grasp a virtual object on the screen using the virtual hand controlled by EMG electrodes placed on his/her forearm. Two metrics were used: score, and time to task completion, where score measured grasp dexterity. It was hypothesized that with the introduction of vibrotactile feedback, dexterity, and therefore score, would improve and time to task completion would decrease. Results showed that time to task completion increased, and score did not improve with vibrotactile feedback. Details on the developed system, the experiment, and the results are presented in this thesis.
Abstract:
The police use both subjective (i.e. police staff) and automated (e.g. face recognition systems) methods for the completion of visual tasks (e.g. person identification). Image quality for police tasks has been defined as image usefulness, or the suitability of the visual material to satisfy a visual task. It is not necessarily affected by artefacts that reduce visual image quality (i.e. decrease fidelity), as long as these artefacts do not affect the information relevant to the task. The capture of useful information is affected by the unconstrained conditions commonly encountered by CCTV systems, such as variations in illumination and high compression levels. The main aim of this thesis is to investigate aspects of image quality and video compression that may affect the completion of police visual tasks/applications with respect to CCTV imagery. This is accomplished by investigating three specific police areas/tasks utilising: 1) the human visual system (HVS) for a face recognition task, 2) automated face recognition systems, and 3) automated human detection systems. These systems (HVS and automated) were assessed with defined scene content properties and video compression, i.e. H.264/MPEG-4 AVC. The performance of imaging systems/processes (e.g. subjective investigations, performance of compression algorithms) is affected by scene content properties; no other investigation has been identified that takes scene content properties into consideration to the same extent. Results have shown that the HVS is more sensitive to compression effects than the automated systems. In automated face recognition systems, `mixed lightness' scenes were the most affected and `low lightness' scenes the least affected by compression. In contrast, for the HVS in the face recognition task, `low lightness' scenes were the most affected and `medium lightness' scenes the least affected. For the automated human detection systems, `close distance' and `run approach' scenes were among the most commonly affected. These findings have the potential to broaden the methods used for testing imaging systems for security applications.
Abstract:
This work presents the design of a real-time system to model visual objects with the use of self-organising networks. The architecture of the system addresses multiple computer vision tasks such as image segmentation, optimal parameter estimation and object representation. We first develop a framework for building non-rigid shapes using the growth mechanism of the self-organising maps, and then we define an optimal number of nodes without overfitting or underfitting the network based on the knowledge obtained from information-theoretic considerations. We present experimental results for hands and faces, and we quantitatively evaluate the matching capabilities of the proposed method with the topographic product. The proposed method is easily extensible to 3D objects, as it offers similar features for efficient mesh reconstruction.
Abstract:
In this work, we propose a biologically inspired appearance model for robust visual tracking. Motivated in part by the success of the hierarchical organization of the primary visual cortex (area V1), we establish an architecture consisting of five layers: whitening, rectification, normalization, coding and pooling. The first three layers stem from models developed for object recognition. In this paper, our attention focuses on the coding and pooling layers. In particular, we use a discriminative sparse coding method in the coding layer along with a spatial pyramid representation in the pooling layer, which makes it easier to distinguish the target to be tracked from its background in the presence of appearance variations. An extensive experimental study shows that the proposed method has higher tracking accuracy than several state-of-the-art trackers.
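The pooling layer can be illustrated with a minimal spatial-pyramid max-pooling sketch (the grid levels and the toy response map below are illustrative assumptions, not the paper's configuration):

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D response map over a pyramid of grids.

    At pyramid level L the map is split into L x L cells and the
    maximum response in each cell is kept. Concatenating the cell
    maxima from all levels yields a fixed-length descriptor that
    preserves coarse spatial layout (1 + 4 + 16 = 21 values here).
    """
    h, w = len(feature_map), len(feature_map[0])
    descriptor = []
    for L in levels:
        for gy in range(L):
            for gx in range(L):
                # cell boundaries, rounded so cells tile the whole map
                y0, y1 = gy * h // L, (gy + 1) * h // L
                x0, x1 = gx * w // L, (gx + 1) * w // L
                descriptor.append(max(feature_map[y][x]
                                      for y in range(y0, y1)
                                      for x in range(x0, x1)))
    return descriptor

if __name__ == "__main__":
    # 4x4 toy response map with one strong activation at top-left
    fmap = [[0.0] * 4 for _ in range(4)]
    fmap[0][0] = 1.0
    d = spatial_pyramid_pool(fmap)
    print(len(d), d[0], d[1])  # the level-1 cell and first level-2 cell both see the peak
```

Because each level records where in the grid the strong sparse-code responses fall, the descriptor separates a target whose responses are spatially concentrated from background clutter with a similar overall response histogram.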
Abstract:
Objective
Pedestrian detection under video surveillance systems has always been a hot topic in computer vision research. These systems are widely used in train stations, airports, large commercial plazas, and other public places. However, pedestrian detection remains difficult because of complex backgrounds. Given its development in recent years, the visual attention mechanism has attracted increasing interest in object detection and tracking research, and previous studies have achieved substantial progress and breakthroughs. We propose a novel pedestrian detection method based on semantic features under the visual attention mechanism.
Method
The proposed semantic feature-based visual attention model is a spatial-temporal model that consists of two parts: the static visual attention model and the motion visual attention model. The static visual attention model in the spatial domain is constructed by combining bottom-up with top-down attention guidance. Based on the characteristics of pedestrians, the bottom-up visual attention model of Itti is improved by intensifying the orientation vectors of elementary visual features to make the visual saliency map suitable for pedestrian detection. In terms of pedestrian attributes, skin color is selected as a semantic feature for pedestrian detection. The regional and Gaussian models are adopted to construct the skin color model. Skin feature-based visual attention guidance is then proposed to complete the top-down process. The bottom-up and top-down visual attentions are linearly combined using the proper weights obtained from experiments to construct the static visual attention model in the spatial domain. The spatial-temporal visual attention model is then constructed via the motion features in the temporal domain. Based on the static visual attention model in the spatial domain, the frame difference method is combined with optical flow to detect motion vectors. Filtering is applied to process the field of motion vectors. The saliency of motion vectors can be evaluated via motion entropy to make the selected motion feature more suitable for the spatial-temporal visual attention model.
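A minimal sketch of the frame-difference and motion-entropy steps described above (illustrative only: the threshold, bin count and interpretation of the entropy score are assumptions, and the combination with optical flow is omitted):

```python
import math

def frame_difference(prev, curr, thresh=15):
    """Binary motion mask from the absolute difference of two grayscale frames."""
    return [[1 if abs(curr[y][x] - prev[y][x]) > thresh else 0
             for x in range(len(curr[0]))] for y in range(len(curr))]

def motion_entropy(magnitudes, n_bins=8, max_mag=255.0):
    """Shannon entropy of the motion-magnitude histogram.

    One possible way to score how concentrated the motion field is:
    a near-uniform field (e.g. noise or swaying leaves) spreads its
    magnitudes over many bins, while a compact moving object
    concentrates them in a few.
    """
    hist = [0] * n_bins
    for m in magnitudes:
        b = min(n_bins - 1, int(m / max_mag * n_bins))
        hist[b] += 1
    total = sum(hist)
    return -sum((c / total) * math.log2(c / total) for c in hist if c)

if __name__ == "__main__":
    # 4x4 frames: one pixel brightens sharply between frames
    prev = [[10] * 4 for _ in range(4)]
    curr = [[10] * 4 for _ in range(4)]
    curr[1][2] = 200
    mask = frame_difference(prev, curr)
    print(sum(map(sum, mask)))  # one moving pixel detected
    print(motion_entropy([0, 0, 0, 190]))
```

In the full model these motion scores would be fused with the static saliency map; here they are shown in isolation.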
Result
Standard datasets and practical videos are selected for the experiments. The experiments are performed on a MATLAB R2012a platform. The experimental results show that our spatial-temporal visual attention model demonstrates favorable robustness under various scenes, including indoor train station surveillance videos and outdoor scenes with swaying leaves. Our proposed model outperforms the visual attention model of Itti, the graph-based visual saliency model, the phase spectrum of quaternion Fourier transform model, and the motion channel model of Liu in terms of pedestrian detection. The proposed model achieves a 93% accuracy rate on the test video.
Conclusion
This paper proposes a novel pedestrian detection method based on the visual attention mechanism. A spatial-temporal visual attention model that uses low-level and semantic features is proposed to calculate the saliency map. Based on this model, pedestrian targets can be detected through shifts in the focus of attention. The experimental results verify the effectiveness of the proposed attention model for detecting pedestrians.
Abstract:
The main objective of this thesis was to obtain, through cognitive electrophysiology, indices of functioning after mild traumatic brain injury (mTBI) at different levels of information processing, namely selective attention, visuo-attentional decision-making processes, and the processes associated with executing a voluntary response. The central hypothesis was that the injury mechanisms and the pathophysiology characterising mTBI produce visuo-attentional dysfunctions, at least during the acute period following the injury (i.e. between 1 and 3 months post-accident), as measured with a new electrophysiological paradigm designed for this purpose. This thesis presents two articles describing the work carried out to meet these objectives and thereby test the stated hypotheses. The first article presents the approach taken to create a new visuospatial attention task yielding the electrophysiological (amplitude, latency) and behavioural (reaction time) indices associated with early visual and attentional processing (P1, N1, N2-nogo, P2, Ptc), selective visual attention (N2pc, SPCN) and decision-making processes (P3b, P3a) in a group of healthy participants (i.e. with no neurological impairment). The second article presents a study of the persistent effects of mTBI on visuo-attentional functions via the targeted electrophysiological indices (amplitude, latency) and behavioural data (task reaction times and neuropsychological test results) in two cohorts of symptomatic mTBI individuals, one in the subacute phase (first 3 months post-accident) and the other in the chronic phase (6 months to 1 year post-accident), compared with a group of healthy control participants.
The results of the articles presented in this thesis show that it was possible to create a simple task for studying, quickly and at low cost, the different levels of information processing involved in the deployment of visuospatial attention. Using this task with mTBI individuals tested in the subacute or chronic phase then made it possible to document differential patterns of impairment and recovery for each of the components studied. Indeed, while the components associated with early visual processing (P1, N1, N2) were intact, some attentional (P2) and cognitive-attentional (P3a, P3b) components were altered, suggesting dysfunction in the spatio-temporal dynamics of attention, the orienting of attention and working memory, in the short and/or long term after mTBI, in the presence of neuropsychological deficits (mainly in the subacute phase) and of persistent post-mTBI symptomatology. This thesis underscores the importance of developing sensitive, comprehensive diagnostic tools for documenting the various cognitive processes and subprocesses that may be affected after mTBI.
Abstract:
Automating the detection and identification of animals is a task of interest in several areas of biological research as well as in the development of electronic surveillance systems. The author presents a detection and identification system based on stereo computer vision. Several criteria are used to identify the animals, but the emphasis is placed on harmonic analysis of a real-time 3D reconstruction of the animals' shape. The result of the analysis is compared with others stored in an evolving knowledge base.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
The degree to which a person relies on visual stimuli for spatial orientation is termed visual dependency (VD). VD is considered a perceptual trait or cognitive style influenced by psychological factors and mediated by central re-weighting of the sensory inputs involved in spatial orientation. VD is often measured using the rod-and-disk test, wherein participants align a central rod to the subjective visual vertical (SVV) in the presence of a background that is either stationary or rotating around the line of sight - dynamic SVV. Although this task has been employed to assess VD in health and vestibular disease, it is unknown what effect torsional nystagmic eye movements may have on individual performance. Using caloric ear irrigation, 3D video-oculography and the rod-and-disk test, we show that caloric torsional nystagmus modulates measures of visual dependency and demonstrate that increases in tilt after irrigation are positively correlated with changes in ocular torsional eye movements. When the direction of the slow phase of the torsional eye movement induced by the caloric is congruent with that induced by the rotating visual stimulus, there is a significant increase in tilt. When these two torsional components are in opposition there is a decrease. These findings show that measures of visual dependence can be influenced by oculomotor responses induced by caloric stimulation. The findings are of significance for clinical studies as they indicate that VD, which often increases in vestibular disorders, is not only modulated by changes in cognitive style but also by eye movements, in particular nystagmus.
Abstract:
The locative project is in a condition of emergence, an embryonic state in which everything is still up for grabs, a zone of consistency yet to emerge. As an emergent practice locative art, like locative media generally, it is simultaneously opening up new ways of engaging in the world and mapping its own domain. (Drew Hemment, 2004) Artists and scientists have always used whatever emerging technologies existed at their particular time in history to push the boundaries of their fields of practice. The use of new technologies, or the notion of ‘new’ media, is neither particularly new nor novel. Humans are adaptive and evolving, and will continue to invent and explore technological innovation. This paper asks the following questions: what role does adaptive and/or intelligent art play in the future of public spaces, and how does this intervention alter the relationship between theory and practice? Does locative or installation-based art reach more people, and does ‘intelligent’ or ‘smart’ art have a larger role to play at the beginning of this century? The speakers will discuss their current collaborative prototype and, within the presentation, demonstrate how software art has the potential to activate public spaces and thereby contribute to a change in spatial or locative awareness. It is argued that the role, and perhaps even the representation, of the audience/viewer is altered through this intervention. 1. A form of electronic imagery created by a collection of mathematically defined lines and/or curves. 2. An experiential form of art which engages the viewer both from within a specific location and in response to their intentional or unintentional input.
Abstract:
Terrestrial remote sensing involves the acquisition of information from the Earth's surface without physical contact with the area under study. Among remote sensing modalities, hyperspectral imaging has recently emerged as a powerful passive technology. It has been widely used in urban and regional planning, water resource management, environmental monitoring, food safety, counterfeit drug detection, detection of oil spills and other chemical contamination, biological hazard prevention, and target detection for military and security purposes [2-9]. Hyperspectral sensors sample the solar radiation reflected from the Earth's surface in the portion of the spectrum extending from the visible region through the near-infrared and mid-infrared (wavelengths between 0.3 and 2.5 µm) in hundreds of narrow (on the order of 10 nm) contiguous bands [10]. This high spectral resolution can be used for object detection and for discriminating between different objects based on their spectral characteristics [6]. However, it also yields large amounts of data to be processed. For example, the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) [11] collects a 512 (along track) × 614 (across track) × 224 (bands) × 12 (bits) data cube in 5 s, corresponding to about 140 MB. Similar data collection rates are achieved by other spectrometers [12]. Such huge data volumes place stringent requirements on communications, storage, and processing. The identification of the signal subspace of hyperspectral data is a crucial first step in many hyperspectral processing algorithms such as target detection, change detection, classification, and unmixing. Identifying this subspace enables a correct dimensionality reduction (DR), yielding gains in data storage and retrieval and in computational time and complexity. Additionally, DR may also improve algorithm performance, since it reduces data dimensionality without losses in the useful signal components. The computation of statistical estimates is a relevant example of the advantages of DR, since the number of samples required to obtain accurate estimates increases drastically with the dimensionality of the data (the Hughes phenomenon) [13].
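The data-volume figure quoted for AVIRIS is easy to check with back-of-the-envelope arithmetic. The ~140 MB value follows if each 12-bit sample is assumed to be stored in a 16-bit word (a common convention for raw sensor data, but an assumption here; the raw 12-bit payload alone is closer to 106 MB):

```python
# Back-of-the-envelope check of the AVIRIS data volume quoted above.
samples = 512 * 614 * 224          # along-track x across-track x bands
raw_bits = samples * 12            # 12-bit digitisation
raw_mb = raw_bits / 8 / 1e6
# Assumption: each 12-bit sample is stored in a 16-bit word, as raw
# sensor data commonly is; this reproduces the ~140 MB figure.
stored_mb = samples * 2 / 1e6
rate_mb_s = stored_mb / 5          # cube acquired in 5 s
print(round(raw_mb), round(stored_mb), round(rate_mb_s, 1))  # → 106 141 28.2
```

A sustained rate of roughly 28 MB/s per cube makes concrete the stringent communication and storage requirements the text mentions.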
Abstract:
Purpose - In this study we aim to validate a method to assess the impact of reduced visual function on observer performance, concurrently with a nodule detection task. Materials and methods - Three consultant radiologists completed a nodule detection task under three conditions: without visual defocus (0.00 Dioptres; D), and with two different magnitudes of visual defocus (−1.00 D and −2.00 D). Defocus was applied with lenses, and visual function was assessed prior to each image evaluation. Observers evaluated the same cases on each occasion; these comprised 50 abnormal cases containing 1–4 simulated nodules (5, 8, 10 and 12 mm spherical diameter, 100 HU) placed within a phantom, and 25 normal cases (images containing no nodules). Data were collected under the free-response paradigm and analysed using RJafroc. A difference in nodule detection performance would be considered significant at p < 0.05. Results - All observers had acceptable visual function prior to beginning the nodule detection task. Visual acuity was reduced to an unacceptable level for two observers when defocussed to −1.00 D and for one observer when defocussed to −2.00 D. Stereoacuity was unacceptable for one observer when defocussed to −2.00 D. Despite unsatisfactory visual function in the presence of defocus, we were unable to find a statistically significant difference in nodule detection performance (F(2,4) = 3.55, p = 0.130). Conclusion - A method to assess visual function and observer performance is proposed. In this pilot evaluation we were unable to detect any difference in nodule detection performance when using lenses to reduce visual function.