979 results for Visual texture recognition
Abstract:
Over the past 25 years, research on human visual development using cerebral electrophysiology and visual evoked potentials (VEPs) has explored several functions associated with the visual cortex. Nevertheless, the development of some of these functions (e.g., texture segmentation), as well as the effects of prematurity on them, requires further study. Moreover, given the importance of vision in the development of certain cognitive functions (e.g., reading, visuomotor skills), a growing body of research is examining the relationships between vision and cognition. The general objectives of this thesis were to study visual development in full-term and preterm children using electrophysiology, and to document the impact of prematurity on visual and cognitive development. Two studies were conducted. The first aimed to examine, in preterm children, the development of the primary visual pathways during the first year of life and at the beginning of schooling, and to document their cognitive and behavioural profile. Using a semi-longitudinal design, ten preterm children were assessed at six months of age (corrected age) and at 7-8 years using VEPs, with cognitive and behavioural measures administered at school age. Their results were compared with those of 10 age-matched full-term children. At six months, no group differences in N1 or P1 latency or amplitude were found. At school age, preterm children showed, compared with full-term children, a larger N1 amplitude in the P-preferential condition and in the condition co-stimulating the M and P pathways, and a larger P1 amplitude (trend) in the M-preferential condition. No group differences were found on the cognitive and behavioural measures. These results suggest that premature birth affects the development of the central visual pathways. The objective of the second study was to document the development of visual texture segmentation processes during early childhood in full-term and preterm children, using VEPs and a cross-sectional design. Forty-five full-term and 43 preterm children were assessed at 12, 24 or 36 months (corrected age for preterm children at 12 and 24 months). The results indicated a significant decrease in the latency of the N2 component between 12 and 36 months in response to orientation, texture and texture segmentation, as well as a significant decrease in amplitude for orientation between 12 and 24 months, and for texture between 12 and 24 months and between 12 and 36 months. Comparisons between full-term and preterm children showed a reduced N2 amplitude in the latter at 12 months for orientation and texture. Although these differences were no longer apparent at 24 months, our results appear to reflect a maturational delay of low- and higher-level visual processes in preterm children, at least during early childhood. In conclusion, our results indicate that prematurity, even without significant neurological impairment, alters the development of visual functions at certain developmental periods, and they highlight the importance of further investigating its medium- and long-term impacts (e.g., cognitive, behavioural, academic).
Abstract:
With many visual speech animation techniques now available, there is a clear need for systematic perceptual evaluation schemes. We describe here our scheme and its application to a new video-realistic (potentially indistinguishable from real recorded video) visual-speech animation system, called Mary 101. Two types of experiments were performed: a) distinguishing visually between real and synthetic image-sequences of the same utterances ("Turing tests"), and b) gauging visual speech recognition by comparing lip-reading performance on the real and synthetic image-sequences of the same utterances ("Intelligibility tests"). Subjects who were presented randomly with either real or synthetic image-sequences could not tell the synthetic from the real sequences above chance level. The same subjects, when asked to lip-read the utterances from the same image-sequences, recognized speech from real image-sequences significantly better than from synthetic ones. However, performance for both real and synthetic sequences was at levels suggested in the literature on lip-reading. We conclude from the two experiments that the animation of Mary 101 is adequate for providing the percept of a talking head. However, additional effort is required to improve the animation for lip-reading purposes such as rehabilitation and language learning. In addition, these two tasks can be considered explicit and implicit perceptual discrimination tasks. In the explicit task (a), each stimulus is classified directly as a synthetic or real image-sequence by detecting a possible difference between the synthetic and the real image-sequences. The implicit perceptual discrimination task (b) consists of a comparison between visual recognition of speech from real and synthetic image-sequences. Our results suggest that implicit perceptual discrimination is a more sensitive method for discriminating between synthetic and real image-sequences than explicit perceptual discrimination.
Abstract:
Changes in the angle of illumination incident upon a 3D surface texture can significantly alter its appearance, implying variations in the image texture. These texture variations produce displacements of class members in the feature space, increasing the failure rates of texture classifiers. To avoid this problem, this paper presents a model-based texture recognition system which classifies textures seen from different distances and under different illumination directions. The system works on the basis of a surface model obtained by means of 4-source colour photometric stereo, used to generate 2D image textures under different illumination directions. The recognition system combines co-occurrence matrices for feature extraction with a Nearest Neighbour classifier. Moreover, the recognition stage allows one to estimate the approximate direction of the illumination used to capture the test image.
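The recognition pipeline described above (co-occurrence-matrix features fed to a Nearest Neighbour classifier) can be sketched as follows. The quantisation level, the three Haralick-style features, and the toy "ramp vs. noise" textures are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def glcm_features(img, levels=8):
    """Normalised grey-level co-occurrence matrix (horizontal offset),
    reduced to three Haralick-style features."""
    q = (np.asarray(img, dtype=float) / 256 * levels).astype(int)  # assumes 8-bit input
    glcm = np.zeros((levels, levels))
    for i, j in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[i, j] += 1                       # count horizontally adjacent level pairs
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = ((i - j) ** 2 * glcm).sum()
    energy = (glcm ** 2).sum()
    homogeneity = (glcm / (1.0 + np.abs(i - j))).sum()
    return np.array([contrast, energy, homogeneity])

def nearest_neighbour(train_feats, train_labels, feat):
    """1-NN classification in feature space."""
    d = np.linalg.norm(np.asarray(train_feats) - feat, axis=1)
    return train_labels[int(np.argmin(d))]

# Two toy texture classes: a smooth horizontal ramp vs. white noise.
rng = np.random.default_rng(0)
smooth = np.tile(np.arange(0, 256, 8), (32, 1))       # low local contrast
noisy = rng.integers(0, 256, (32, 32))                # high local contrast
train = [glcm_features(smooth), glcm_features(noisy)]
labels = ["smooth", "noisy"]
test_patch = rng.integers(0, 256, (32, 32))           # unseen noise sample
predicted = nearest_neighbour(train, labels, glcm_features(test_patch))
```

In the paper's setting the training features would come from images rendered from the photometric-stereo surface model under many illumination directions, so the classifier sees the illumination-induced feature displacements at training time.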
Abstract:
Background: Few studies have investigated how individuals diagnosed with post-stroke Broca’s aphasia decompose words into their constituent morphemes in real-time processing. Previous research has focused on morphologically complex words in non-time-constrained settings or in syntactic frames, but not in the lexicon. Aims: We examined real-time processing of morphologically complex words in a group of five Greek-speaking individuals with Broca’s aphasia to determine: (1) whether their morphological decomposition mechanisms are sensitive to lexical (orthography and frequency) vs. morphological (stem-suffix combinatory features) factors during visual word recognition, (2) whether these mechanisms are different in inflected vs. derived forms during lexical access, and (3) whether there is a preferred unit of lexical access (syllables vs. morphemes) for inflected vs. derived forms. Methods & Procedures: The study included two real-time experiments. The first was a semantic judgment task necessitating participants’ categorical judgments for high- and low-frequency inflected real words and pseudohomophones of the real words created by either an orthographic error at the stem or a homophonous (but incorrect) inflectional suffix. The second experiment was a letter-priming task at the syllabic or morphemic boundary of morphologically transparent inflected and derived words whose stems and suffixes were matched for length, lemma and surface frequency. Outcomes & Results: The majority of the individuals with Broca’s aphasia were sensitive to lexical frequency and stem orthography, while ignoring the morphological combinatory information encoded in the inflectional suffix that control participants were sensitive to. 
The letter-priming task, on the other hand, showed that individuals with aphasia, in contrast to controls, had preferences with regard to the unit of lexical access: they were overall faster on syllabically than morphemically parsed words, and their morphological decomposition mechanisms for inflected and derived forms were modulated by the unit of lexical access. Conclusions: Our results show that in morphological processing, Greek-speaking persons with aphasia rely mainly on stem access and are thus only sensitive to orthographic violations of the stem morphemes, but not to illegal morphological combinations of stems and suffixes. This possibly indicates an intact orthographic lexicon but deficient morphological decomposition mechanisms, perhaps stemming from an underspecification of inflectional suffixes in the participants' grammar. Syllabic information, however, appears to facilitate lexical access and elicits repair mechanisms that compensate for deviant morphological parsing procedures.
Abstract:
Given the widespread use of computers, the visual pattern recognition task has been automated in order to cope with the huge number of available digital images. Many applications use image processing techniques, together with feature extraction and visual pattern recognition algorithms, to identify people, to ease the disease diagnosis process, to classify objects, etc., based on digital images. Among the features that can be extracted and analyzed from images is the shape of objects or regions. In some cases, shape is the only feature that can be extracted from the image with relatively high accuracy. In this work we present some of the most important shape analysis methods and compare their performance when applied to three well-known shape image databases. Finally, we propose the development of a new shape descriptor based on the Hough Transform.
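The proposed Hough Transform-based descriptor is not specified in the abstract. As a hedged illustration of the general idea, the sketch below builds a toy shape signature from a line Hough transform, recording for each angle bin the fraction of boundary points that vote for the single strongest line; the number of angle bins and the rho bin width are assumptions:

```python
import math

def hough_signature(points, n_theta=6, rho_bin=2.0):
    """Toy shape signature from a line Hough transform: for each angle bin,
    the fraction of points lying on the strongest line at that angle.
    (Illustrative only, not the descriptor proposed in the work.)"""
    sig = []
    for t in range(n_theta):
        theta = math.pi * t / n_theta
        acc = {}                                  # rho-bin -> vote count
        for x, y in points:
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = int(round(rho / rho_bin))
            acc[key] = acc.get(key, 0) + 1
        sig.append(max(acc.values()) / len(points))
    return sig

# A horizontal segment concentrates all votes in one rho bin at theta = 90 deg,
# so the signature peaks sharply at that angle bin.
segment = [(x, 0) for x in range(10)]
sig = hough_signature(segment)
```

A shape made of straight edges thus yields a signature with sharp peaks at the edge orientations, while a blob-like shape spreads its votes across rho bins at every angle.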
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Neuronal models predict that retrieval of specific event information reactivates brain regions that were active during encoding of this information. Consistent with this prediction, this positron-emission tomography study showed that remembering that visual words had been paired with sounds at encoding activated some of the auditory brain regions that were engaged during encoding. After word-sound encoding, activation of auditory brain regions was also observed during visual word recognition when there was no demand to retrieve auditory information. Collectively, these observations suggest that information about the auditory components of multisensory event information is stored in auditory responsive cortex and reactivated at retrieval, in keeping with classical ideas about “redintegration,” that is, the power of part of an encoded stimulus complex to evoke the whole experience.
Abstract:
The specificity of the improvement in perceptual learning is often used to localize the neuronal changes underlying this type of adult plasticity. We investigated a visual texture discrimination task previously reported to be accomplished preattentively and for which learning-related changes were inferred to occur at a very early level of the visual processing stream. The stimulus was a matrix of lines from which a target popped out, due to an orientation difference between the three target lines and the background lines. The task was to report the global orientation of the target and was performed monocularly. The subjects' performance improved dramatically with training over the course of 2-3 weeks, after which we tested the specificity of the improvement for the trained eye. In all subjects tested, there was complete interocular transfer of the learning effect. The neuronal correlates of this learning are therefore most likely localized in a visual area where input from the two eyes has come together.
Abstract:
One hundred and twelve university students completed 7 tests assessing word-reading accuracy, print exposure, phonological sensitivity, phonological coding and knowledge of English morphology as predictors of spelling accuracy. Together the tests accounted for 71% of the variance in spelling, with phonological skills and morphological knowledge emerging as strong predictors of spelling accuracy for words with both regular and irregular sound-spelling correspondences. The pattern of relationships was consistent with a model in which, as a function of the learning opportunities that are provided by reading experience, phonological skills promote the learning of individual word orthographies and structural relationships among words.
Abstract:
Background - It is well established that the left inferior frontal gyrus plays a key role in the cerebral cortical network that supports reading and visual word recognition. Less clear is when in time this contribution begins. We used magnetoencephalography (MEG), which has both good spatial and excellent temporal resolution, to address this question. Methodology/Principal Findings - MEG data were recorded during a passive viewing paradigm, chosen to emphasize the stimulus-driven component of the cortical response, in which right-handed participants were presented words, consonant strings, and unfamiliar faces to central vision. Time-frequency analyses showed a left-lateralized inferior frontal gyrus (pars opercularis) response to words between 100–250 ms in the beta frequency band that was significantly stronger than the response to consonant strings or faces. The left inferior frontal gyrus response to words peaked at ~130 ms. This response was significantly later in time than the left middle occipital gyrus, which peaked at ~115 ms, but not significantly different from the peak response in the left mid fusiform gyrus, which peaked at ~140 ms, at a location coincident with the fMRI–defined visual word form area (VWFA). Significant responses were also detected to words in other parts of the reading network, including the anterior middle temporal gyrus, the left posterior middle temporal gyrus, the angular and supramarginal gyri, and the left superior temporal gyrus. Conclusions/Significance - These findings suggest very early interactions between the vision and language domains during visual word recognition, with speech motor areas being activated at the same time as the orthographic word-form is being resolved within the fusiform gyrus. 
This challenges the conventional view of a temporally serial processing sequence for visual word recognition in which letter forms are initially decoded, interact with their phonological and semantic representations, and only then gain access to a speech code.
Abstract:
To represent the local orientation and energy of a 1-D image signal, many models of early visual processing employ bandpass quadrature filters, formed by combining the original signal with its Hilbert transform. However, representations capable of estimating an image signal's 2-D phase have been largely ignored. Here, we consider 2-D phase representations using a method based upon the Riesz transform. For spatial images there exist two Riesz transformed signals and one original signal from which orientation, phase and energy may be represented as a vector in 3-D signal space. We show that these image properties may be represented by a Singular Value Decomposition (SVD) of the higher-order derivatives of the original and the Riesz transformed signals. We further show that the expected responses of even and odd symmetric filters from the Riesz transform may be represented by a single signal autocorrelation function, which is beneficial in simplifying Bayesian computations for spatial orientation. Importantly, the Riesz transform allows one to weight linearly across orientation using both symmetric and asymmetric filters to account for some perceptual phase distortions observed in image signals - notably one's perception of edge structure within plaid patterns whose component gratings are either equal or unequal in contrast. Finally, exploiting the benefits that arise from the Riesz definition of local energy as a scalar quantity, we demonstrate the utility of Riesz signal representations in estimating the spatial orientation of second-order image signals. We conclude that the Riesz transform may be employed as a general tool for 2-D visual pattern recognition by its virtue of representing phase, orientation and energy as orthogonal signal quantities.
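A minimal sketch of the Riesz-transform representation described above, computed in the frequency domain. The paper's SVD analysis, autocorrelation formulation, and Bayesian computations are omitted, and the axis conventions and test grating are assumptions:

```python
import numpy as np

def riesz_transform(img):
    """First-order Riesz transform pair of a 2-D signal,
    computed with the standard frequency-domain kernels -i*u/|u| and -i*v/|u|."""
    f = np.fft.fft2(img)
    u = np.fft.fftfreq(img.shape[0])[:, None]   # frequencies along rows
    v = np.fft.fftfreq(img.shape[1])[None, :]   # frequencies along columns
    mag = np.sqrt(u ** 2 + v ** 2)
    mag[0, 0] = 1.0                             # avoid division by zero at DC
    r1 = np.real(np.fft.ifft2(-1j * u / mag * f))
    r2 = np.real(np.fft.ifft2(-1j * v / mag * f))
    return r1, r2

def local_orientation_energy(img):
    """Orientation, phase-invariant energy from the 3-D signal space
    spanned by the original signal and its two Riesz components."""
    r1, r2 = riesz_transform(img)
    energy = np.sqrt(img ** 2 + r1 ** 2 + r2 ** 2)   # scalar local energy
    orientation = np.arctan2(r2, r1)                 # dominant local orientation
    return orientation, energy

# Example: a sinusoidal grating varying along the row axis.
N = 64
y = np.arange(N, dtype=float)[:, None]
grating = np.sin(2 * np.pi * 4 * y / N) * np.ones((1, N))
orientation, energy = local_orientation_energy(grating)
```

For the pure grating, the Riesz pair reduces to a Hilbert transform along the modulation axis, so the local energy is constant across the image even though the signal itself oscillates; this is the scalar-energy property the abstract exploits for second-order signals.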
Abstract:
Brain-computer interfaces (BCI) have the potential to restore communication or control abilities in individuals with severe neuromuscular limitations, such as those with amyotrophic lateral sclerosis (ALS). The role of a BCI is to extract and decode relevant information that conveys a user's intent directly from brain electro-physiological signals and translate this information into executable commands to control external devices. However, the BCI decision-making process is error-prone due to noisy electro-physiological data, representing the classic problem of efficiently transmitting and receiving information via a noisy communication channel.
This research focuses on P300-based BCIs which rely predominantly on event-related potentials (ERP) that are elicited as a function of a user's uncertainty regarding stimulus events, in either an acoustic or a visual oddball recognition task. The P300-based BCI system enables users to communicate messages from a set of choices by selecting a target character or icon that conveys a desired intent or action. P300-based BCIs have been widely researched as a communication alternative, especially in individuals with ALS who represent a target BCI user population. For the P300-based BCI, repeated data measurements are required to enhance the low signal-to-noise ratio of the elicited ERPs embedded in electroencephalography (EEG) data, in order to improve the accuracy of the target character estimation process. As a result, BCIs have relatively slower speeds when compared to other commercial assistive communication devices, and this limits BCI adoption by their target user population. The goal of this research is to develop algorithms that take into account the physical limitations of the target BCI population to improve the efficiency of ERP-based spellers for real-world communication.
In this work, it is hypothesised that building adaptive capabilities into the BCI framework can potentially give the BCI system the flexibility to improve performance by adjusting system parameters in response to changing user inputs. The research in this work addresses three potential areas for improvement within the P300 speller framework: information optimisation, target character estimation and error correction. The visual interface and its operation control the method by which the ERPs are elicited through the presentation of stimulus events. The parameters of the stimulus presentation paradigm can be modified to modulate and enhance the elicited ERPs. A new stimulus presentation paradigm is developed in order to maximise the information content that is presented to the user by tuning stimulus paradigm parameters to positively affect performance. Internally, the BCI system determines the amount of data to collect and the method by which these data are processed to estimate the user's target character. Algorithms that exploit language information are developed to enhance the target character estimation process and to correct erroneous BCI selections. In addition, a new model-based method to predict BCI performance is developed, an approach which is independent of stimulus presentation paradigm and accounts for dynamic data collection. The studies presented in this work provide evidence that the proposed methods for incorporating adaptive strategies in the three areas have the potential to significantly improve BCI communication rates, and the proposed method for predicting BCI performance provides a reliable means to pre-assess BCI performance without extensive online testing.
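The need for repeated measurements can be illustrated with a toy simulation of coherent trial averaging: averaging n trials leaves the time-locked ERP intact while shrinking uncorrelated background noise by roughly sqrt(n). The waveform shape, amplitudes, and noise level below are hypothetical, not taken from the work:

```python
import numpy as np

rng = np.random.default_rng(7)
n_trials, n_samples = 100, 200
t = np.linspace(0.0, 0.8, n_samples)                       # seconds post-stimulus
# Hypothetical P300-like component: a 5 uV Gaussian bump near 300 ms.
p300 = 5e-6 * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))
noise_sd = 20e-6                                           # single-trial background EEG
trials = p300 + rng.normal(0.0, noise_sd, (n_trials, n_samples))

erp = trials.mean(axis=0)               # coherent average: signal adds, noise cancels
single_trial_error = (trials[0] - p300).std()
averaged_error = (erp - p300).std()     # shrinks by roughly sqrt(n_trials)
```

The sqrt(n) trade-off is exactly the speed problem described above: halving the error requires quadrupling the number of stimulus repetitions, which is why the adaptive data-collection and language-model methods in this work aim to reach a reliable target-character estimate with fewer trials.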
Abstract:
The ability to understand written texts, i.e., to construct a coherent mental representation of text content, is a necessary prerequisite for successful development in and out of school. It is therefore a central concern of the education system to diagnose reading difficulties early and to address them with targeted intervention programmes. This requires comprehensive knowledge of the cognitive subprocesses underlying reading comprehension, of their interrelations, and of their development. The present dissertation aims to contribute to a comprehensive understanding of reading comprehension by experimentally investigating a selection of open questions. Study 1 examines the extent to which phonological recoding and orthographic decoding skills contribute to sentence and text comprehension, and how both skills develop in German primary-school children from Grade 2 to Grade 4. The results suggest that both skills make significant and independent contributions to reading comprehension and that their relative contribution does not change across grade levels. Moreover, German second-graders already recognise the majority of written words in age-appropriate texts via orthographic comparison processes. Nevertheless, German primary-school children apparently make continuous use of phonological information to optimise visual word recognition. Study 2 extends previous empirical research on one of the best-known models of reading comprehension, the Simple View of Reading (SVR; Gough & Tunmer, 1986). The study tests the SVR (Reading comprehension = Decoding x Comprehension) using optimised and methodologically stringent measures of the model's constituents and examines its generalisability to German third- and fourth-graders. Study 2 shows that the SVR does not withstand a methodologically stringent test and cannot readily be generalised to German third- and fourth-graders. Only weak evidence was found for a multiplicative combination of decoding (D) and listening comprehension (C) skills. The fact that a considerable portion of the variance in reading comprehension (R) could not be explained by D and C suggests that the model is incomplete and may need to be supplemented with additional components. Study 3 investigates the processing of positive-causal and negative-causal coherence relations in German first- to fourth-graders and adults, in both reading and listening comprehension. Consistent with the Cumulative Cognitive Complexity approach (Evers-Vermeul & Sanders, 2009; Spooren & Sanders, 2008), Study 3 shows that processing negative-causal coherence relations and connectives is cognitively more demanding than processing positive-causal relations. Furthermore, comprehension of both types of coherence relations continues to develop across the primary-school years and, for negative-causal relations, is not yet complete by the end of Grade 4. Study 4 demonstrates and discusses the usefulness of process-oriented reading tests such as ProDi-L (Richter et al., in press), which selectively assess individual differences in the cognitive component skills of reading comprehension. As an example, the construct validity of the ProDi-L subtest 'Syntactic Integration' is established. Using explanatory item-response models, it is shown that the test measures syntactic integration skills separately and can identify children with deficient syntactic skills. The reported findings contribute to a comprehensive understanding of the cognitive component skills of reading comprehension, which is essential for the optimal design of reading instruction, learning materials, and textbooks. Moreover, they provide the basis for a meaningful diagnosis of individual reading difficulties and for the design of adaptive, targeted intervention programmes to promote reading comprehension in poor readers.
Abstract:
During grasping and intelligent robotic manipulation tasks, the camera position relative to the scene changes dramatically because the robot moves to adapt its path and correctly grasp objects; this is because the camera is mounted on the robot effector. For this reason, in this type of environment, a visual recognition system must be implemented that recognizes objects and obtains their positions in the scene automatically and autonomously. Furthermore, in industrial environments, all objects manipulated by robots are made of the same material and cannot be differentiated by features such as texture or color. In this work, first, a study and analysis of 3D recognition descriptors has been completed for application in these environments. Second, a visual recognition system based on a specific distributed client-server architecture has been proposed for the recognition of industrial objects lacking these appearance features. Our system has been implemented to overcome recognition problems that arise when objects can only be recognized by their geometric shape and the simplicity of the shapes could create ambiguity. Finally, some real tests are performed and presented to verify the satisfactory performance of the proposed system.
Abstract:
Magdeburg, Univ., Faculty of Computer Science, Diss., 2014