978 results for Visual identification tasks


Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this report we summarize the state of the art of speech emotion recognition from the signal processing point of view. On the basis of multi-corpus experiments with machine-learning classifiers, we observe that existing supervised machine learning approaches lead to database-dependent classifiers that cannot be applied to multi-language speech emotion recognition without additional training, because they discriminate the emotion classes according to the training language used. As there are experimental results showing that humans can perform language-independent categorisation, we drew a parallel between machine recognition and the cognitive process and tried to discover the sources of these divergent results. The analysis suggests that the main difference is that speech perception allows the extraction of language-independent features, although language-dependent features are incorporated at all levels of the speech signal and play a strong discriminative role in human perception. Based on several results in related domains, we suggest that, in addition, the cognitive process of emotion recognition is based on categorisation, assisted by a hierarchical structure of the emotional categories that exists in the cognitive space of all humans. We propose a strategy for developing language-independent machine emotion recognition, related to the identification of language-independent speech features and the use of additional information from visual (expression) features.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Congenital nystagmus (CN) is an ocular-motor disorder characterised by involuntary, conjugate ocular oscillations; its pathogenesis is still under investigation. This kind of nystagmus is termed congenital (or infantile) because it may be present at birth or arise in the first months of life. Most CN patients show a considerable decrease in visual acuity: image fixation on the retina is disturbed by the continuous, mainly horizontal, oscillations of nystagmus. However, the image of a given target can still be stable during short periods in which eye velocity slows down while the target image is placed on the fovea (called foveation intervals). To quantify the extent of nystagmus, eye movement recordings are routinely employed, allowing physicians to extract and analyse the main features of nystagmus, such as waveform shape, amplitude and frequency. Using eye movement recordings, it is also possible to compute estimated visual acuity predictors: analytical functions that estimate expected visual acuity from signal features such as foveation time and foveation position variability. Use of those functions extends the information from typical visual acuity measurement (e.g. the Landolt C test) and could support therapy planning or monitoring. This study focuses on detection of CN patients' waveform type and on foveation time measurement. Specifically, it proposes a robust method to recognize cycles corresponding to the specific CN waveform in the eye movement pattern and, for those cycles, to evaluate the exact signal tracts in which a subject foveates. About 40 eye movement recordings, either infrared-oculographic or electrooculographic, were acquired from 16 CN subjects. Results suggest that the use of an adaptive threshold applied to the eye velocity signal could improve the estimation of the slow phase start point. This can enhance foveation time computation and reduce the influence of repositioning saccades and data noise on waveform type identification.
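The abstract above does not say how its adaptive threshold is computed. As a rough illustration only, the sketch below derives a threshold from the recording's own velocity statistics (median plus a multiple of the median absolute deviation) and flags low-velocity samples as slow-phase candidates. The function names, the `scale` parameter, and the median/MAD rule are all assumptions made for this example, not the study's method.

```python
from statistics import median


def velocity(position, sample_rate_hz):
    """Estimate eye velocity from position samples with finite differences."""
    dt = 1.0 / sample_rate_hz
    n = len(position)
    vel = [0.0] * n
    # Central differences for interior samples.
    for i in range(1, n - 1):
        vel[i] = (position[i + 1] - position[i - 1]) / (2 * dt)
    # One-sided differences at the two ends.
    if n > 1:
        vel[0] = (position[1] - position[0]) / dt
        vel[-1] = (position[-1] - position[-2]) / dt
    return vel


def adaptive_threshold(vel, scale=3.0):
    """Hypothetical rule: median speed plus `scale` times the MAD of speed."""
    speeds = [abs(v) for v in vel]
    med = median(speeds)
    mad = median([abs(s - med) for s in speeds])
    return med + scale * mad


def slow_phase_mask(position, sample_rate_hz, scale=3.0):
    """True where speed is at or below the data-driven threshold."""
    vel = velocity(position, sample_rate_hz)
    thr = adaptive_threshold(vel, scale)
    return [abs(v) <= thr for v in vel]


# Synthetic sawtooth: slow drift for ten samples, then a fast reset.
trace = [float(i % 10) for i in range(30)]
mask = slow_phase_mask(trace, sample_rate_hz=1.0)
```

Because the threshold is recomputed from each recording, it adapts to different noise levels and nystagmus amplitudes, which is the general motivation for an adaptive rather than a fixed velocity criterion.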

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We report an extension of the procedure devised by Weinstein and Shanks (Memory & Cognition 36:1415-1428, 2008) to study false recognition and priming of pictures. Participants viewed scenes with multiple embedded objects (seen items), then studied the names of these objects and the names of other objects (read items). Finally, participants completed a combined direct (recognition) and indirect (identification) memory test that included seen items, read items, and new items. In the direct test, participants recognized pictures of seen and read items more often than new pictures. In the indirect test, participants' speed at identifying those same pictures was improved for pictures that they had actually studied, and also for falsely recognized pictures whose names they had read. These data provide new evidence that a false-memory induction procedure can elicit memory-like representations that are difficult to distinguish from "true" memories of studied pictures. © 2012 Psychonomic Society, Inc.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Purpose: Dementia is associated with various alterations of the eye and visual function. Over 60% of cases are attributable to Alzheimer's disease, a significant proportion of the remainder to vascular dementia or dementia with Lewy bodies, while frontotemporal dementia and Parkinson's disease dementia are less common. This review describes the oculo-visual problems of these five dementias and the pathological changes which may explain these symptoms. It further discusses clinical considerations to help the clinician care for older patients affected by dementia. Recent findings: Visual problems in dementia include loss of visual acuity, defects in colour vision and visual masking tests, changes in pupillary response to mydriatics, defects in fixation and smooth and saccadic eye movements, changes in contrast sensitivity function and visual evoked potentials, and disturbance of complex visual functions such as reading ability, visuospatial function, and the naming and identification of objects. Pathological changes have also been reported affecting the crystalline lens, retina, optic nerve, and visual cortex. Clinically, issues such as cataract surgery, correcting the refractive error, quality of life, falls, visual impairment and eye care for dementia have been addressed. Summary: Many visual changes occur across dementias, but the findings are controversial and often based on limited patient numbers, and no single feature can be regarded as diagnostic of any specific dementia. Nevertheless, visual hallucinations may be more characteristic of dementia with Lewy bodies and Parkinson's disease dementia than Alzheimer's disease or frontotemporal dementia. Differences in saccadic eye movement dysfunction may also help to distinguish Alzheimer's disease from frontotemporal dementia and Parkinson's disease dementia from dementia with Lewy bodies.
Eye care professionals need to keep informed of the growing literature in vision/dementia, be attentive to signs and symptoms suggestive of cognitive impairment, and be able to adapt their practice and clinical interventions to best serve patients with dementia.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Current reform initiatives recommend that school geometry teaching and learning include the study of three-dimensional geometric objects and provide students with opportunities to use spatial abilities in mathematical tasks. Two ways of using Geometer's Sketchpad (GSP), a dynamic and interactive computer program, in conjunction with manipulatives enable students to investigate and explore geometric concepts, especially when used in a constructivist setting. Research on spatial abilities has focused on visual reasoning to improve visualization skills. This dissertation investigated the hypothesis that connecting visual and analytic reasoning may better improve students' spatial visualization abilities as compared to instruction that makes little or no use of the connection of the two. Data were collected using the Purdue Spatial Visualization Tests (PSVT) administered as a pretest and posttest to a control and two experimental groups. Sixty-four 10th grade students in three geometry classrooms participated in the study for 6 weeks. Research questions were answered using statistical procedures. An analysis of covariance was used for a quantitative analysis, whereas a description of students' visual-analytic processing strategies was presented using qualitative methods. The quantitative results indicated that there were significant differences by gender, but not by group. However, when analyzing a subsample of 33 participants with pretest scores below the 50th percentile, males in one of the experimental groups significantly benefited from the treatment. A review of previous research also indicated that students with low visualization skills benefited more than those with higher visualization skills. The qualitative results showed that girls were more sophisticated in their visual-analytic processing strategies to solve three-dimensional tasks.
It is recommended that the teaching and learning of spatial visualization start in the middle school, prior to students' more rigorous mathematics exposure in high school. A duration longer than 6 weeks for treatments in similar future research studies is also recommended.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Medicine has changed in recent years. Medicare with all of its rules and regulations, workers' compensation laws, managed care and the trend toward more and larger group practices all contributed to the creation of an extremely structured regulatory environment, which in turn demanded highly trained medical administrative assistants. The researcher noted three primary problems in the identification of competencies for the medical administrative assistant position: a lack of curricula, diverse roles, and a complex environment which has undergone radical change in recent years and will continue to evolve. Therefore, the purposes of the study were to use the DACUM process to develop a relevant list of competencies required by the medical administrative assistant practicing in physicians' offices in South Florida; determine the rank order of importance of each competency using a scale of one to five; cross-validate the DACUM group scores with a second population who did not participate in the DACUM process; and establish a basis for a curriculum framework for an occupational program. The DACUM process of curriculum development was selected because it seemed best suited to the need to develop a list of competencies for an occupation for which no programs existed. A panel of expert medical office administrative staff was selected to attend a 2-day workshop to describe their jobs in great detail. The panel, led by a trained facilitator, listed the major duties and the respective tasks of their job. Brainstorming techniques were used to develop a consensus. Based upon the DACUM workshop, a survey was developed listing the 8 major duties and 71 tasks identified by the panel. The survey was mailed to the DACUM group and to a second, larger population who did not participate in the DACUM. The survey results from the two groups were then compared. The non-DACUM group validated all but 3 of the 71 tasks listed by the DACUM panel.
Because the three tasks were rated by the second group as at least "somewhat important" and rated "very important" by the DACUM group, the researcher recommended the inclusion of all 71 tasks in program development for this occupation.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

As you consider this catalogue of works, reflect upon the variety of tasks (intellectual, emotional and technical) that have led to this visible record of ability and expression. The interdisciplinary rigors of the visual arts are present in these pages, and the breadth of the skill, ability, problem-solving and communication that have been developed and refined during the years of study is portrayed in this presentation of accomplishment. The graduates represented in the following pages will go on to a variety of careers: teaching, making art, starting businesses, or following any number of diverse paths for which they have prepared during their undergraduate years. The work they have chosen to present here is merely a synopsis of the broad spectrum of skills and abilities they have gained during their years at Grenfell College.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Temporal-order judgment (TOJ) and simultaneity judgment (SJ) tasks are used to study differences in speed of processing across sensory modalities, stimulus types, or experimental conditions. Matthews and Welch (2015) reported that observed performance in SJ and TOJ tasks is superior when visual stimuli are presented in the left visual field (LVF) compared to the right visual field (RVF), revealing an LVF advantage presumably reflecting attentional influences. Because observed performance reflects the interplay of perceptual and decisional processes involved in carrying out the tasks, analyses that separate out these influences are needed to determine the origin of the LVF advantage. We re-analyzed the data of Matthews and Welch (2015) using a model of performance in SJ and TOJ tasks that separates out these influences. Parameter estimates capturing the operation of perceptual processes did not differ between hemifields by these analyses, whereas parameter estimates capturing the operation of decisional processes differed. In line with other evidence, perceptual processing also did not differ between SJ and TOJ tasks. Thus, the LVF advantage occurs with identical speeds of processing in both visual hemifields. If attention is responsible for the LVF advantage, it does not exert its influence via prior entry.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Corticobasal degeneration is a rare, progressive neurodegenerative disease and a member of the 'parkinsonian' group of disorders, which also includes Parkinson's disease, progressive supranuclear palsy, dementia with Lewy bodies and multiple system atrophy. The most common initial symptom is limb clumsiness, usually affecting one side of the body, with or without accompanying rigidity or tremor. Subsequently, the disease affects gait and there is a slow progression to affect ipsilateral arms and legs. Apraxia and dementia are the most common cortical signs. Corticobasal degeneration can be difficult to distinguish from other parkinsonian syndromes but if ocular signs and symptoms are present, they may aid clinical diagnosis. Typical ocular features include increased latency of saccadic eye movements ipsilateral to the side exhibiting apraxia, impaired smooth pursuit movements and visuo-spatial dysfunction, especially involving spatial rather than object-based tasks. Less typical features include reduction in saccadic velocity, vertical gaze palsy, visual hallucinations, sleep disturbance and an impaired electroretinogram. Aspects of primary vision such as visual acuity and colour vision are usually unaffected. Dealing with problems of walking, movement, daily tasks and speech is an important aspect of managing the condition.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The occurrences of visual hallucinations seem to be more prevalent in low light and hallucinators tend to be more prone to false positive type errors in memory tasks. Here we investigated whether the richness of stimuli does indeed affect recognition differently in hallucinating and nonhallucinating participants, and if so whether this difference extends to identifying spatial context. We compared 36 Parkinson's disease (PD) patients with visual hallucinations, 32 Parkinson's patients without hallucinations, and 36 age-matched controls, on a visual memory task where color and black and white pictures were presented at different locations. Participants had to recognize the pictures among distracters along with the location of the stimulus. Findings revealed clear differences in performance between the groups. Both PD groups had impaired recognition compared to the controls, but those with hallucinations were significantly more impaired on black and white than on color stimuli. In addition, the group with hallucinations was significantly impaired compared to the other two groups on spatial memory. We suggest that not only do PD patients have poorer recognition of pictorial stimuli than controls, those who present with visual hallucinations appear to be more heavily reliant on bottom up sensory input and impaired on spatial ability.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The police use both subjective (i.e. police staff) and automated (e.g. face recognition systems) methods for the completion of visual tasks (e.g. person identification). Image quality for police tasks has been defined as the image usefulness, or image suitability of the visual material to satisfy a visual task. It is not necessarily affected by any artefact that may affect the visual image quality (i.e. decrease fidelity), as long as these artefacts do not affect the relevant useful information for the task. The capture of useful information will be affected by the unconstrained conditions commonly encountered by CCTV systems, such as variations in illumination and high compression levels. The main aim of this thesis is to investigate aspects of image quality and video compression that may affect the completion of police visual tasks/applications with respect to CCTV imagery. This is accomplished by investigating 3 specific police areas/tasks utilising: 1) the human visual system (HVS) for a face recognition task, 2) automated face recognition systems, and 3) automated human detection systems. These systems (HVS and automated) were assessed with defined scene content properties and video compression, i.e. H.264/MPEG-4 AVC. The performance of imaging systems/processes (e.g. subjective investigations, performance of compression algorithms) is affected by scene content properties. No other investigation has been identified that takes scene content properties into consideration to the same extent. Results have shown that the HVS is more sensitive to compression effects than the automated systems. In automated face recognition systems, 'mixed lightness' scenes were the most affected and 'low lightness' scenes were the least affected by compression. In contrast, for the HVS face recognition task, 'low lightness' scenes were the most affected and 'medium lightness' scenes the least affected.
For the automated human detection systems, 'close distance' and 'run approach' are some of the most commonly affected scenes. Findings have the potential to broaden the methods used for testing imaging systems for security applications.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In the last decade, research in Computer Vision has developed several algorithms to help botanists and non-experts classify plants based on images of their leaves. LeafSnap is a mobile application that uses a multiscale curvature model of the leaf margin to classify leaf images into species. It has achieved high levels of accuracy on 184 tree species from the Northeast US. We extend the research that led to the development of LeafSnap along two lines. First, LeafSnap's underlying algorithms are applied to a set of 66 tree species from Costa Rica. Then, texture is used as an additional criterion to measure the level of improvement achieved in the automatic identification of Costa Rican tree species. A 25.6% improvement was achieved for a Costa Rican clean image dataset and 42.5% for a Costa Rican noisy image dataset. In both cases, our results show this increment to be statistically significant. Further statistical analysis is also presented of the impact of visual noise, the best algorithm combinations per species, and the best value of k, the minimal cardinality of the set of candidate species that the tested algorithms render as best matches.
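To make the role of k concrete, the following sketch shows one common way to score a classifier that returns a ranked list of candidate species: count a query as correct if the true species appears among the top k candidates, then look for the smallest k that reaches a target hit rate. The function names and the toy species labels are hypothetical, and the abstract does not state that this exact criterion was used.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of queries whose true label is among the first k candidates."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


def smallest_k_reaching(ranked_predictions, true_labels, target):
    """Smallest k whose top-k accuracy is at least `target`, or None."""
    max_k = max(len(ranked) for ranked in ranked_predictions)
    for k in range(1, max_k + 1):
        if top_k_accuracy(ranked_predictions, true_labels, k) >= target:
            return k
    return None


# Toy example with made-up species labels.
ranked = [
    ["ceiba", "guanacaste", "cedro"],
    ["guanacaste", "ceiba", "cedro"],
    ["cedro", "guanacaste", "ceiba"],
]
truth = ["ceiba", "ceiba", "ceiba"]
best_k = smallest_k_reaching(ranked, truth, target=0.6)
```

A small k keeps the candidate list short for the user, while a larger k raises the chance that the correct species is on the list, so the choice is a trade-off rather than a single correct value.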

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, how to combine them to improve on the performance of individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images while edges are similarities between image pairs. Our first approach employs a mixture Markov model based on a random walk model on multiple graphs to fuse graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by a greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained by different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information. We apply a novel multi-task learning algorithm using both low-level features and attributes. A low-rank attribute embedding is jointly learned within the multi-task learning formulation to embed original binary attributes into a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered.
To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.
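As an illustration of the greedy optimization of a submodular objective mentioned in this abstract, the sketch below selects a subset of images that maximizes a facility-location function over a similarity matrix. This is a stand-in chosen for the example: the abstract's own objective (an information gain term combined with rank relationships from multiple ranked lists) is not specified here and is not reproduced. The function names and the similarity values are hypothetical, and similarities are assumed to be non-negative.

```python
def facility_location(sim, selected):
    """Sum over all images of their best similarity to any selected image."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_select(sim, budget):
    """Greedily add the image with the largest marginal gain each round."""
    n = len(sim)
    selected = []
    current = 0.0
    for _ in range(min(budget, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            gain = facility_location(sim, selected + [j]) - current
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        current += best_gain
    return selected


# Images 0 and 1 are near-duplicates; image 2 is different from both.
sim = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]
chosen = greedy_select(sim, budget=2)
```

For monotone submodular objectives such as this one, the greedy procedure is known to achieve at least a (1 - 1/e) fraction of the optimal value under a cardinality constraint, which is why greedy selection is a standard choice for objectives of this kind.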

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Increasing the size of training data has been shown to be very effective in many computer vision tasks. Using large-scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet and the number is increasing every day. Dealing with large-scale image sets is demanding in itself. They take a significant amount of memory, which makes it impossible to process the images with complex algorithms on single-CPU machines. Finding an efficient image representation can be a key to attacking this problem. Efficiency alone is not enough for image understanding: the representation should also be comprehensive and rich in semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large-scale image classification. We present techniques to learn the binary features from supervised image sets (with different types of semantic supervision: class labels, textual descriptions). We propose several problems that are very important in finding and using efficient image representations.
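The efficiency argument for binary codes can be illustrated with a minimal sketch: a feature vector is reduced to a few bits, and two images are compared with a Hamming distance computed by an XOR and a bit count. The codes below come from random hyperplane projections, a generic unsupervised technique substituted here for illustration; the proposal's learned, semantically supervised codes are not reproduced. All names and parameters are assumptions for this example.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(vector, hyperplanes):
    """Pack one sign bit per hyperplane into a single integer code."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


planes = make_hyperplanes(dim=4, n_bits=16)
feature = [0.5, -1.0, 2.0, 0.25]
code = binary_code(feature, planes)
# Scaling a feature vector by a positive factor leaves its code unchanged.
same = hamming(code, binary_code([2 * x for x in feature], planes))
```

A 16-bit code occupies two bytes, whereas the original four floating-point features occupy at least sixteen, and the gap widens for the high-dimensional descriptors used in practice.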

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. The proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space, which can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult for a machine to understand, identify uniquely, and interact with. From an applications perspective, and as part of the EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author's knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation methodology where visual and tactile feedback (in the form of surface wrinkledness captured by the high-accuracy depth sensor, i.e. the CloPeMa stereo head, or the predictive confidence modelled by a Gaussian Process) serve as the halting criteria in the flattening and sorting tasks, respectively.
From a scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures, and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of the proposed architecture comprises 3D generic surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasping approach achieves 84.7% accuracy on average; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state of the art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally, the Gaussian Process based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state of the art of robot clothes perception and manipulation.
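Of the features named in this abstract, Local Binary Patterns are the simplest to show in a few lines. The sketch below computes the basic 8-neighbour LBP code for each interior pixel of a small 2D grid and collects the codes into a histogram that can serve as a texture descriptor. It is the textbook 2D form only; how the thesis adapts LBP to clothing surfaces is not described in the abstract, and the function names and neighbour ordering here are choices made for this example.

```python
def lbp_codes(image):
    """Basic 8-neighbour LBP code for every interior pixel of a 2D grid."""
    rows, cols = len(image), len(image[0])
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(image):
    """256-bin histogram of LBP codes, usable as a texture descriptor."""
    hist = [0] * 256
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
    return hist


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
flat_code = lbp_codes(flat)[0][0]
peak_code = lbp_codes(peak)[0][0]
```

Because each code depends only on whether neighbours are brighter or darker than the centre, the descriptor is unaffected by uniform changes in brightness, which is one reason LBP is widely used for texture.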