995 results for perception processes


Relevance:

30.00%

Publisher:

Abstract:

This study investigated the influence of top-down and bottom-up information on speech perception in complex listening environments. Specifically, the effects of listening to different types of processed speech were examined on intelligibility and on simultaneous visual-motor performance. The goal was to extend the generalizability of results in speech perception to environments outside of the laboratory. The effect of bottom-up information was evaluated with natural, cell phone and synthetic speech. The effect of simultaneous tasks was evaluated with concurrent visual-motor and memory tasks. Earlier work on the perception of speech during simultaneous visual-motor tasks has shown inconsistent results (Choi, 2004; Strayer & Johnston, 2001). In the present experiments, two dual-task paradigms were constructed in order to mimic non-laboratory listening environments. In the first two experiments, an auditory word repetition task was the primary task and a visual-motor task was the secondary task. Participants were presented with different kinds of speech in a background of multi-speaker babble and were asked to repeat the last word of every sentence while doing the simultaneous tracking task. Word accuracy and visual-motor task performance were measured. Taken together, the results of Experiments 1 and 2 showed that the intelligibility of natural speech was better than that of synthetic speech, and that synthetic speech was better perceived than cell phone speech. The visual-motor methodology provided independent, supplementary information and a better understanding of the entire speech perception process. Experiment 3 was conducted to determine whether the automaticity of the tasks (Schneider & Shiffrin, 1977) helped to explain the results of the first two experiments. It was found that cell phone speech allowed better simultaneous pursuit rotor performance only at low intelligibility levels, when participants ignored the listening task. 
Also, simultaneous task performance improved dramatically for natural speech when intelligibility was good. Overall, it could be concluded that knowledge of intelligibility alone is insufficient to characterize processing of different speech sources. Additional measures such as attentional demands and performance of simultaneous tasks were also important in characterizing the perception of different kinds of speech in complex listening environments.

Relevance:

30.00%

Publisher:

Abstract:

The research activity carried out during the PhD course was focused on the development of mathematical models of some cognitive processes and their validation by means of data present in the literature, with a double aim: i) to achieve a better interpretation and explanation of the great amount of data obtained on these processes from different methodologies (electrophysiological recordings in animals; neuropsychological, psychophysical and neuroimaging studies in humans), ii) to exploit model predictions and results to guide future research and experiments. In particular, the research activity has been focused on two different projects: 1) the first concerns the development of networks of neural oscillators, in order to investigate the mechanisms of synchronization of neural oscillatory activity during cognitive processes such as object recognition, memory, language and attention; 2) the second concerns the mathematical modelling of multisensory integration processes (e.g. visual-acoustic), which occur in several cortical and subcortical regions (in particular in a subcortical structure named the Superior Colliculus (SC)) and which are fundamental for orienting motor and attentive responses to external-world stimuli. This activity has been realized in collaboration with the Center for Studies and Researches in Cognitive Neuroscience of the University of Bologna (in Cesena) and the Department of Neurobiology and Anatomy of the Wake Forest University School of Medicine (NC, USA). PART 1. Object representation in a number of cognitive functions, like perception and recognition, involves distributed processes in different cortical areas. One of the main neurophysiological questions concerns how the correlation between these disparate areas is realized, in order to succeed in grouping together the characteristics of the same object (the binding problem) and in keeping segregated the properties belonging to different objects simultaneously present (the segmentation problem). 
Different theories have been proposed to address these questions (Barlow, 1972). One of the most influential is the so-called “assembly coding” theory, postulated by Singer (2003), according to which 1) an object is well described by a few fundamental properties, processed in different and distributed cortical areas; 2) recognition of the object would be realized by means of the simultaneous activation of the cortical areas representing its different features; 3) groups of properties belonging to different objects would be kept separated in the time domain. In Chapter 1.1 and Chapter 1.2 we present two neural network models for object recognition based on the “assembly coding” hypothesis. These models are networks of Wilson-Cowan oscillators which exploit: i) two high-level “Gestalt rules” (the similarity and previous-knowledge rules) to realize the functional link between elements of different cortical areas representing properties of the same object (the binding problem); ii) the synchronization of neural oscillatory activity in the γ-band (30-100 Hz) to segregate in time the representations of different objects simultaneously present (the segmentation problem). These models are able to recognize and reconstruct multiple simultaneous external objects, even in difficult cases (some wrong or missing features, shared features, superimposed noise). In Chapter 1.3 the previous models are extended to realize a semantic memory, in which sensory-motor representations of objects are linked with words. To this aim, the previously developed network, devoted to the representation of objects as collections of sensory-motor features, is reciprocally linked with a second network devoted to the representation of words (the lexical network). Synapses linking the two networks are trained via a time-dependent Hebbian rule, during a training period in which individual objects are presented together with the corresponding words. 
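As a minimal sketch of the oscillator dynamics at the heart of these models (not the thesis implementation; all parameter values here are invented for illustration), a small network of coupled Wilson-Cowan excitatory/inhibitory pairs can be integrated as follows:

```python
import numpy as np

def sigmoid(x, gain=4.0, theta=0.8):
    # Saturating activation function of a Wilson-Cowan unit
    return 1.0 / (1.0 + np.exp(-gain * (x - theta)))

def simulate(n=2, coupling=0.4, steps=2000, dt=0.01, seed=0):
    """Euler integration of n excitatory (E) / inhibitory (I) Wilson-Cowan
    pairs, with mutual excitatory coupling between oscillators. Parameters
    are illustrative, not those of the thesis models."""
    rng = np.random.default_rng(seed)
    E, I = rng.random(n), rng.random(n)
    trace = np.empty((steps, n))
    tau_e, tau_i = 0.1, 0.2                         # time constants
    for t in range(steps):
        net = coupling * (E.sum() - E)              # drive from the other units
        dE = (-E + sigmoid(1.6 * E - 1.0 * I + 0.5 + net)) / tau_e
        dI = (-I + sigmoid(1.5 * E - 0.5 * I)) / tau_i
        E, I = E + dt * dE, I + dt * dI
        trace[t] = E
    return trace

activity = simulate()   # excitatory activity of each unit over time
```

In the actual models the coupling strengths are shaped by the Gestalt rules, so that units coding features of the same object synchronize in the γ-band while units coding different objects fire out of phase.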
Simulation results demonstrate that, during the retrieval phase, the network can deal with the simultaneous presence of objects (from sensory-motor inputs) and words (from linguistic inputs), can correctly associate objects with words, and can segment objects even in the presence of incomplete information. Moreover, the network can realize some semantic links among words representing objects with shared features. These results support the idea that semantic memory can be described as an integrated process, whose content is retrieved by the co-activation of different multimodal regions. In perspective, extended versions of this model may be used to test conceptual theories and to provide a quantitative assessment of existing data (for instance, concerning patients with neural deficits). PART 2. The ability of the brain to integrate information from different sensory channels is fundamental to perception of the external world (Stein et al., 1993). It is well documented that a number of extraprimary areas have neurons capable of such a task; one of the best known of these is the superior colliculus (SC). This midbrain structure receives auditory, visual and somatosensory inputs from different subcortical and cortical areas, and is involved in the control of orientation to external events (Wallace et al., 1993). SC neurons respond to each of these sensory inputs separately, but are also capable of integrating them (Stein et al., 1993), so that the response to combined multisensory stimuli is greater than that to the individual component stimuli (enhancement). This enhancement is proportionately greater if the modality-specific paired stimuli are weaker (the principle of inverse effectiveness). Several studies have shown that the capability of SC neurons to engage in multisensory integration requires inputs from cortex, primarily the anterior ectosylvian sulcus (AES), but also the rostral lateral suprasylvian sulcus (rLS). 
If these cortical inputs are deactivated, the response of SC neurons to cross-modal stimulation is no different from that evoked by the most effective of its individual component stimuli (Jiang et al., 2001). This phenomenon can be better understood through mathematical models. The use of mathematical models and neural networks can place the mass of data that has been accumulated about this phenomenon and its underlying circuitry into a coherent theoretical structure. In Chapter 2.1 a simple neural network model of this structure is presented; this model is able to reproduce a large number of SC behaviours, such as multisensory enhancement, multisensory and unisensory depression, and inverse effectiveness. In Chapter 2.2 this model was improved by incorporating more neurophysiological knowledge about the neural circuitry underlying SC multisensory integration, in order to suggest possible physiological mechanisms through which it is effected. This endeavour was realized in collaboration with Professor B.E. Stein and Doctor B. Rowland during the six-month period spent at the Department of Neurobiology and Anatomy of the Wake Forest University School of Medicine (NC, USA), within the Marco Polo Project. The model includes four distinct unisensory areas that are devoted to a topological representation of external stimuli. Two of them represent subregions of the AES (i.e., FAES, an auditory area, and AEV, a visual area) and send descending inputs to the ipsilateral SC; the other two represent subcortical areas (one auditory and one visual) projecting ascending inputs to the same SC. Different competitive mechanisms, realized by means of populations of interneurons, are used in the model to reproduce the different behaviour of SC neurons under cortical activation and deactivation. 
The model, with a single set of parameters, is able to mimic the behaviour of SC multisensory neurons in response to very different stimulus conditions (multisensory enhancement, inverse effectiveness, within- and cross-modal suppression of spatially disparate stimuli), with cortex functional and cortex deactivated, and with a particular type of membrane receptors (NMDA receptors) active or inhibited. All these results agree with the data reported in Jiang et al. (2001) and in Binns and Salt (1996). The model suggests that non-linearities in neural responses and synaptic (excitatory and inhibitory) connections can explain the fundamental aspects of multisensory integration, and provides a biologically plausible hypothesis about the underlying circuitry.
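The inverse-effectiveness principle mentioned above emerges naturally from a saturating input-output nonlinearity. The toy sketch below (with invented gain and threshold values, not the published model) shows that the same pair of stimuli yields a proportionately larger multisensory gain when the stimuli are weak:

```python
import math

def sc_response(v, a, gain=6.0, theta=0.6):
    """Static sigmoid standing in for an SC neuron's input-output function
    (illustrative parameters, not those of the thesis model)."""
    return 1.0 / (1.0 + math.exp(-gain * (v + a - theta)))

def enhancement(v, a):
    # Percent multisensory enhancement over the best unisensory response
    best_uni = max(sc_response(v, 0.0), sc_response(0.0, a))
    return 100.0 * (sc_response(v, a) - best_uni) / best_uni

weak = enhancement(0.3, 0.3)    # weak paired stimuli -> large relative gain
strong = enhancement(0.8, 0.8)  # strong paired stimuli -> small relative gain
```

Because the strong stimuli already drive the neuron near saturation, the relative benefit of pairing them is small, while weak stimuli sit on the steep part of the sigmoid and gain the most from summation.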

Relevance:

30.00%

Publisher:

Abstract:

Perceptual closure refers to the coherent perception of an object under circumstances when the visual information is incomplete. Although the perceptual closure index observed in electroencephalography reflects that an object has been recognized, the full spatiotemporal dynamics of cortical source activity underlying perceptual closure processing remain unknown so far. To address this question, we recorded magnetoencephalographic activity in 15 subjects (11 females) during a visual closure task and performed beamforming over a sequence of successive short time windows to localize high-frequency gamma-band activity (60–100 Hz). Two-tone images of human faces (Mooney faces) were used to examine perceptual closure. Event-related fields exhibited a magnetic closure index between 250 and 325 ms. Time-frequency analyses revealed sustained high-frequency gamma-band activity associated with the processing of Mooney stimuli; closure-related gamma-band activity was observed between 200 and 300 ms over occipitotemporal channels. Time-resolved source reconstruction revealed an early (0–200 ms) coactivation of caudal inferior temporal gyrus (cITG) and regions in posterior parietal cortex (PPC). At the time of perceptual closure (200–400 ms), the activation in cITG extended to the fusiform gyrus, if a face was perceived. Our data provide the first electrophysiological evidence that perceptual closure for Mooney faces starts with an interaction between areas related to processing of three-dimensional structure from shading cues (cITG) and areas associated with the activation of long-term memory templates (PPC). Later, at the moment of perceptual closure, inferior temporal cortex areas specialized for the perceived object are activated, i.e., the fusiform gyrus related to face processing for Mooney stimuli.

Relevance:

30.00%

Publisher:

Abstract:

Neuropsychological studies have suggested that imagery processes may be mediated by neuronal mechanisms similar to those used in perception. To test this hypothesis, and to explore the neural basis for song imagery, 12 normal subjects were scanned using the water bolus method to measure cerebral blood flow (CBF) during the performance of three tasks. In the control condition subjects saw pairs of words on each trial and judged which word was longer. In the perceptual condition subjects also viewed pairs of words, this time drawn from a familiar song; simultaneously they heard the corresponding song, and their task was to judge the change in pitch of the two cued words within the song. In the imagery condition, subjects performed precisely the same judgment as in the perceptual condition, but with no auditory input. Thus, to perform the imagery task correctly an internal auditory representation must be accessed. Paired-image subtraction of the resulting pattern of CBF, together with matched MRI for anatomical localization, revealed that both perceptual and imagery tasks produced similar patterns of CBF changes, as compared to the control condition, in keeping with the hypothesis. More specifically, both perceiving and imagining songs are associated with bilateral neuronal activity in the secondary auditory cortices, suggesting that processes within these regions underlie the phenomenological impression of imagined sounds. Other CBF foci elicited in both tasks include areas in the left and right frontal lobes and in the left parietal lobe, as well as the supplementary motor area. This latter region implicates covert vocalization as one component of musical imagery. Direct comparison of imagery and perceptual tasks revealed CBF increases in the inferior frontal polar cortex and right thalamus. We speculate that this network of regions may be specifically associated with retrieval and/or generation of auditory information from memory.
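The paired-image subtraction logic can be sketched with toy data: task-minus-control maps are subtracted per subject, averaged, and converted to a z-map whose peaks mark candidate activation foci. This is only a schematic illustration (synthetic arrays, invented sizes); the actual study additionally coregistered each CBF map to the subject's MRI:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subj, shape = 12, (16, 16)

# Synthetic per-subject CBF maps: baseline plus noise, with a small
# "activated" focus added to the task condition
control = rng.normal(50.0, 5.0, (n_subj, *shape))
task = control + rng.normal(0.0, 1.0, (n_subj, *shape))
task[:, 4:6, 4:6] += 6.0                 # activation focus at rows/cols 4-5

diff = task - control                    # paired (within-subject) subtraction
z = diff.mean(0) / (diff.std(0, ddof=1) / np.sqrt(n_subj))
focus = np.unravel_index(np.argmax(z), shape)   # peak of the group z-map
```

The peak of the z-map falls inside the simulated focus, which is the same logic by which the study's subtraction maps localize the secondary auditory cortex foci.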

Relevance:

30.00%

Publisher:

Abstract:

Zeki and co-workers recently proposed that perception can best be described as locally distributed, asynchronous processes that each create a kind of microconsciousness, which condense into an experienced percept. The present article aims at extending this theory to metacognitive feelings. We present evidence that perceptual fluency, the subjective feeling of ease during perceptual processing, is based on the speed of processing at different stages of the perceptual process. Specifically, detection of briefly presented stimuli was influenced by figure-ground contrast, but not by the symmetry (Experiment 1) or the font (Experiment 2) of the stimuli. Conversely, discrimination of these stimuli was influenced by whether they were symmetric (Experiment 1) and by the font they were presented in (Experiment 2), but not by figure-ground contrast. Both tasks, however, were related to the subjective experience of fluency (Experiments 1 and 2). We conclude that subjective fluency is the conscious phenomenal correlate of different processing stages in visual perception.

Relevance:

30.00%

Publisher:

Abstract:

OBJECTIVE This study aimed to test the prediction from the Perception and Attention Deficit model of complex visual hallucinations (CVH) that impairments in visual attention and perception are key risk factors for complex hallucinations in eye disease and dementia. METHODS Two studies ran concurrently to investigate the relationship between CVH and impairments in perception (picture naming using the Graded Naming Test) and attention (Stroop task plus a novel imagery task). The studies were run in two populations, older patients with dementia (n = 28) and older people with eye disease (n = 50), with a shared control group (n = 37). The same methodology was used in both studies, and the North East Visual Hallucinations Inventory was used to identify CVH. RESULTS A reliable relationship was found for older patients with dementia between impaired perceptual and attentional performance and CVH. A reliable relationship was not found in the population of people with eye disease. CONCLUSIONS The results add to previous research indicating that object perception and attentional deficits are associated with CVH in dementia, but that risk factors for CVH in eye disease are inconsistent, suggesting that dynamic rather than static impairments in attentional processes may be key in this population.

Relevance:

30.00%

Publisher:

Abstract:

A growing number of studies in humans demonstrate the involvement of vestibular information in tasks that are seemingly remote from well-known functions such as space constancy or postural control. In this review article we point out three emerging streams of research highlighting the importance of vestibular input: (1) Spatial Cognition: Modulation of vestibular signals can induce specific changes in spatial cognitive tasks like mental imagery and the processing of numbers. This has been shown in studies manipulating body orientation (changing the input from the otoliths), body rotation (changing the input from the semicircular canals), in clinical findings with vestibular patients, and in studies carried out in microgravity. There is also an effect in the reverse direction; top-down processes can affect perception of vestibular stimuli. (2) Body Representation: Numerous studies demonstrate that vestibular stimulation changes the representation of body parts, and sensitivity to tactile input or pain. Thus, the vestibular system plays an integral role in multisensory coordination of body representation. (3) Affective Processes and Disorders: Studies in psychiatric patients and patients with a vestibular disorder report a high comorbidity of vestibular dysfunctions and psychiatric symptoms. Recent studies investigated the beneficial effect of vestibular stimulation on psychiatric disorders, and how vestibular input can change mood and affect. These three emerging streams of research in vestibular science are, at least in part, associated with different neuronal core mechanisms. Spatial transformations draw on parietal areas, body representation is associated with somatosensory areas, and affective processes involve insular and cingulate cortices, all of which receive vestibular input. Even though a wide range of vestibular cortical projection areas has been ascertained, their functionality is still scarcely understood.

Relevance:

30.00%

Publisher:

Abstract:

One of the earliest accounts of duration perception by Karl von Vierordt implied a common process underlying the timing of intervals in the sub-second and the second range. To date, there are two major explanatory approaches for the timing of brief intervals: the Common Timing Hypothesis and the Distinct Timing Hypothesis. While the common timing hypothesis also proceeds from a unitary timing process, the distinct timing hypothesis suggests two dissociable, independent mechanisms for the timing of intervals in the sub-second and the second range, respectively. In the present paper, we introduce confirmatory factor analysis (CFA) to elucidate the internal structure of interval timing in the sub-second and the second range. Our results indicate that the assumption of two mechanisms underlying the processing of intervals in the second and the sub-second range might be more appropriate than the assumption of a unitary timing mechanism. In contrast to the basic assumption of the distinct timing hypothesis, however, these two timing mechanisms are closely associated with each other and share 77% of common variance. This finding suggests either a strong functional relationship between the two timing mechanisms or a hierarchically organized internal structure. Findings are discussed in the light of existing psychophysical and neurophysiological data.
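The 77% figure corresponds to a latent factor correlation of √0.77 ≈ 0.88. A quick simulation (an illustration of that arithmetic, not the authors' confirmatory factor analysis) recovers the shared variance from two correlated latent factors:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
phi = np.sqrt(0.77)          # factor correlation implying 77% shared variance

# Two correlated latent timing factors (sub-second range, second range),
# built from independent standard-normal variables
z1, z2 = rng.standard_normal((2, n))
f_sub = z1
f_sec = phi * z1 + np.sqrt(1.0 - phi**2) * z2

r = np.corrcoef(f_sub, f_sec)[0, 1]
shared = r**2                # recovered shared variance, close to 0.77
```

The squared correlation between the two simulated factors comes out near 0.77, which is exactly the sense in which the two timing mechanisms, though dissociable, remain closely associated.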

Relevance:

30.00%

Publisher:

Abstract:

This paper proposes a new method, oriented to real-time image processing, for identifying crop rows of maize fields in the images. The vision system is designed to be installed onboard a mobile agricultural vehicle, that is, subject to gyrations, vibrations, and undesired movements. The images are captured under perspective projection and are affected by the above undesired effects. The image processing consists of two main processes: image segmentation and crop row detection. The first applies a threshold to separate green plants or pixels (crops and weeds) from the rest (soil, stones, and others). It is based on a fuzzy clustering process, which yields the threshold to be applied during normal operation. The crop row detection applies a method based on image perspective projection that searches for maximum accumulations of segmented green pixels along straight alignments. These determine the expected crop lines in the images. The method is robust enough to work under the above-mentioned undesired effects. It compares favorably against the well-tested Hough transform for line detection.
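The two processes can be caricatured in a few lines. This sketch substitutes a fixed excess-green threshold for the paper's fuzzy-clustering one and a simple vertical accumulator for its perspective-projected line search, so it is only a schematic of the idea:

```python
import numpy as np

def segment_green(img):
    """Threshold the excess-green index 2G - R - B; the paper instead
    derives the threshold via fuzzy clustering."""
    r, g, b = (img[..., i].astype(float) for i in range(3))
    return (2 * g - r - b) > 20

def row_columns(mask, n_rows=2):
    """Columns accumulating the most green pixels are taken as crop-row
    centres; the paper searches along perspective-projected alignments."""
    votes = mask.sum(axis=0)
    return np.sort(np.argsort(votes)[-n_rows:])

# Synthetic field: two one-pixel-wide green rows on brown soil
img = np.full((40, 60, 3), (120, 90, 60), dtype=np.uint8)
img[:, 15] = (40, 160, 40)
img[:, 45] = (40, 160, 40)
rows = row_columns(segment_green(img))   # -> columns 15 and 45
```

Accumulating votes along candidate alignments rather than fitting individual pixels is what makes this family of methods tolerant to the vibrations and spurious movements mentioned above.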

Relevance:

30.00%

Publisher:

Abstract:

Robotics has evolved exponentially in recent decades, allowing current systems to perform highly complex tasks with great precision, reliability and speed. However, this development has been tied to an increasing degree of specialization and particularization of the technologies involved, which are very efficient in specific, controlled situations but incapable in changing, dynamic and unstructured environments. The development of robotics must therefore focus on giving systems the capacity to adapt to circumstances, to understand the changes they observe, and to interact flexibly with the environment. These are the defining characteristics of human interaction with the environment, the ones that allow humans to survive and that can give a system sufficient intelligence and capability to operate autonomously and independently in a real environment. This adaptability is especially important in the management of risks and uncertainties, since it is the mechanism that makes it possible to contextualize and evaluate threats in order to produce an adequate response. Thus, for example, when people move and interact with their environment, they do not evaluate obstacles in terms of their position, velocity or dynamics (as traditional robotic systems do), but by estimating the potential risk those elements pose to them. This evaluation is achieved by combining two human psychophysical processes: on the one hand, human perception analyzes the relevant elements of the environment, trying to understand their nature from behaviour patterns, associated properties or other distinctive traits. On the other hand, as a second level of evaluation, understanding this nature allows humans to know or estimate the relationship of those elements to themselves, as well as the implications in terms of risk. 
Establishing these semantic relationships, called cognition, is the only way to define the level of risk in absolute terms and to generate an adequate response to it: not necessarily a proportional response, but one coherent with the risk being faced. The research presented in this thesis describes the work carried out to transfer this methodology of analysis and operation to robotics. It has focused especially on the navigation of aerial robots, designing and implementing human-inspired procedures to guarantee its safety. To this end, the human mechanisms of perception, cognition and reaction involved in risk management have been studied and evaluated. How stimuli are captured, processed and transformed by the psychological, sociological and anthropological conditioning factors of human beings has also been analyzed, as has the way these factors motivate and trigger human reactions to danger. As a result of this study, all these processes, behaviours and behavioural conditioning factors have been reproduced in a framework structured around analogous factors. It uses the experimentally obtained knowledge in the form of algorithms, techniques and strategies, emulating human behaviour under the same circumstances. Designed, implemented and validated both in simulation and with real data, this framework proposes an innovative way, in both methodology and procedure, of understanding and reacting to the potential threats of a robotic mission. ABSTRACT Robotics has undergone a great revolution in the last decades. Nowadays this technology is able to perform really complex tasks with a high degree of accuracy and speed; however, this is only true in precisely defined situations with fully controlled variables. 
Since the real world is dynamic, changing and unstructured, flexible, non-context-dependent systems are required. The ability to understand situations, acknowledge changes and balance reactions is required by robots to successfully interact with their surroundings in a fully autonomous fashion. In fact, it is those very processes that define human interactions with the environment. Social relationships, driving or risk/uncertainty management... in all these activities and systems, context understanding and adaptability are what allow human beings to survive: contrary to traditional robotics, people do not evaluate obstacles according to their position but according to the potential risk their presence implies. In this sense, human perception looks for information which goes beyond location, speed and dynamics (the usual data used in traditional obstacle avoidance systems). Specific features in the behaviour of a particular element allow the understanding of that element’s nature and therefore the comprehension of the risk posed by it. This process defines the second main difference between traditional obstacle avoidance systems and human behaviour: the ability to understand a situation/scenario allows one to grasp the implications of the elements and their relationship with the observer. Establishing these semantic relationships (termed cognition) is the only way to estimate the actual danger level of an element. Furthermore, only the application of this knowledge allows the generation of coherent, suitable and adjusted responses to deal with any risk faced. The research presented in this thesis summarizes the work done towards translating these human cognitive/reasoning procedures to the field of robotics. More specifically, the work done has been focused on employing human-based methodologies to enable aerial robots to navigate safely. 
To this effect, human perception, cognition and reaction processes concerning risk management have been studied experimentally, as has the acquisition and processing of stimuli. How psychological, sociological and anthropological factors modify, balance and give shape to those stimuli has been researched. And finally, the way in which these factors motivate human behaviour according to different mindsets and priorities has been established. This associative workflow has been reproduced by establishing an equivalent structure and defining similar factors and sources. In addition, all the knowledge obtained experimentally has been applied in the form of algorithms, techniques and strategies which emulate the analogous human behaviours. As a result, a framework capable of understanding and reacting in response to stimuli has been implemented and validated.
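The perception → cognition → reaction split described above might be caricatured as follows. Every category, weight and threshold in this sketch is invented for illustration and is not taken from the thesis:

```python
# Hypothetical risk levels attached to an element's inferred nature
RISK_BY_NATURE = {"static": 0.2, "vehicle": 0.6, "person": 0.9}

def perceive(track):
    """Stage 1 (perception): infer an element's nature from behaviour
    patterns; a toy rule on observed speed stands in for real classifiers."""
    if track["speed"] < 0.1:
        return "static"
    return "person" if track["speed"] < 3.0 else "vehicle"

def assess(track, distance):
    """Stage 2 (cognition): relate the element's nature to the robot itself,
    yielding an absolute risk level rather than a purely geometric one."""
    base = RISK_BY_NATURE[perceive(track)]
    return base / max(distance, 1.0)     # closer elements matter more

def react(risk):
    # Stage 3: a coherent (not necessarily proportional) response
    if risk > 0.5:
        return "evade"
    return "monitor" if risk > 0.1 else "ignore"
```

The key design point mirrored here is that the response is selected from the semantic risk level, not from raw position and velocity as in traditional obstacle avoidance.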

Relevance:

30.00%

Publisher:

Abstract:

The proposal highlights certain design strategies and a case study that can link material urban space to emerging digital realms. The composite nature of urban spaces (material/digital) is understood as an opportunity to reconfigure public urban spaces without high-cost, difficult-to-apply interventions and, furthermore, to reactivate them by inserting dynamic, interactive and playful conditions that engage people and re-establish their relations to the cities. The structuring of coexisting and interconnected material and digital aspects of public urban spaces is proposed through the implementation of hybridization processes. Hybrid spaces can fascinate and provoke the public, and especially younger people, to get involved and interact with physical aspects of urban public spaces as well as with digital representations or interpretations of those. Digital game design in urban public spaces can be understood as a tool that allows architects to understand and to configure hybrids of material and digital conceptions and to project them all in one, as an inseparable totality. Digital technologies have for a long time now intervened in our perception of traditional dipoles such as subject/environment. Architects, especially in the past, have been responsible for material mediations and tangible interfaces that permit subjects to relate to their physical environments in a controlled and regulated manner; but nowadays architects are compelled to embody in design the transition that is happening in all aspects of everyday life, that is, from material to digital realities. In addition, the disjunctive relation of material and digital realms is ceding, and architects are now faced with the challenge posed by the merging of both in a single, all-inclusive reality. The case study is a design project for a game implemented simultaneously in a specific urban space and on the internet. 
This project, developed in the spring-semester course New Media in Architecture at the Department of Architecture, Democritus University of Thrace, is situated in the city of Xanthi, Greece. Composite cities can use design strategies and technological tools to configure augmented and appealing urban spaces that articulate and connect different realms in a single engaging reality.

Relevance:

30.00%

Publisher:

Abstract:

Video quality measurement is still needed to define the criteria that characterize a signal meeting the viewing requirements imposed by the user. New technologies, such as stereoscopic 3D video or formats beyond high definition, impose new criteria that must be analyzed to achieve the greatest possible user satisfaction. Among the problems detected during this doctoral thesis, phenomena were identified that affect different phases of the audiovisual production chain and a variety of content types. First, the content generation process must be controlled through parameters that prevent visual discomfort and, consequently, visual fatigue, especially for stereoscopic 3D content, both animated and live action. Second, the quality measures applied at the video compression stage use metrics that are sometimes not adapted to the user's perception. The use of psychovisual models and visual attention maps would make it possible to weight the areas of the image so that greater importance is given to the pixels the user is most likely to fixate. These two blocks are related through the definition of the term saliency. Saliency is the capacity of the visual system to characterize a viewed image by weighting the areas that are most attractive to the human eye. In the generation of stereoscopic content, saliency refers mainly to the depth simulated by the optical illusion, measured as the distance from the virtual object to the human eye. In two-dimensional video, however, saliency is based not on depth but on additional elements, such as motion, level of detail, pixel position or the presence of faces, which are the basic factors composing the visual attention model developed here. 
Con el objetivo de detectar las características de una secuencia de vídeo estereoscópico que, con mayor probabilidad, pueden generar disconfort visual, se consultó la extensa literatura relativa a este tema y se realizaron unas pruebas subjetivas preliminares con usuarios. De esta forma, se llegó a la conclusión de que se producía disconfort en los casos en que se producía un cambio abrupto en la distribución de profundidades simuladas de la imagen, aparte de otras degradaciones como la denominada “violación de ventana”. A través de nuevas pruebas subjetivas centradas en analizar estos efectos con diferentes distribuciones de profundidades, se trataron de concretar los parámetros que definían esta imagen. Los resultados de las pruebas demuestran que los cambios abruptos en imágenes se producen en entornos con movimientos y disparidades negativas elevadas que producen interferencias en los procesos de acomodación y vergencia del ojo humano, así como una necesidad en el aumento de los tiempos de enfoque del cristalino. En la mejora de las métricas de calidad a través de modelos que se adaptan al sistema visual humano, se realizaron también pruebas subjetivas que ayudaron a determinar la importancia de cada uno de los factores a la hora de enmascarar una determinada degradación. Los resultados demuestran una ligera mejora en los resultados obtenidos al aplicar máscaras de ponderación y atención visual, los cuales aproximan los parámetros de calidad objetiva a la respuesta del ojo humano. ABSTRACT Video quality assessment is still a necessary tool for defining the criteria to characterize a signal with the viewing requirements imposed by the final user. New technologies, such as 3D stereoscopic video and formats of HD and beyond HD oblige to develop new analysis of video features for obtaining the highest user’s satisfaction. 
Among the problems detected during the process of this doctoral thesis, it has been determined that some phenomena affect to different phases in the audiovisual production chain, apart from the type of content. On first instance, the generation of contents process should be enough controlled through parameters that avoid the occurrence of visual discomfort in observer’s eye, and consequently, visual fatigue. It is especially necessary controlling sequences of stereoscopic 3D, with both animation and live-action contents. On the other hand, video quality assessment, related to compression processes, should be improved because some objective metrics are adapted to user’s perception. The use of psychovisual models and visual attention diagrams allow the weighting of image regions of interest, giving more importance to the areas which the user will focus most probably. These two work fields are related together through the definition of the term saliency. Saliency is the capacity of human visual system for characterizing an image, highlighting the areas which result more attractive to the human eye. Saliency in generation of 3DTV contents refers mainly to the simulated depth of the optic illusion, i.e. the distance from the virtual object to the human eye. On the other hand, saliency is not based on virtual depth, but on other features, such as motion, level of detail, position of pixels in the frame or face detection, which are the basic features that are part of the developed visual attention model, as demonstrated with tests. Extensive literature involving visual comfort assessment was looked up, and the development of new preliminary subjective assessment with users was performed, in order to detect the features that increase the probability of discomfort to occur. 
With this methodology, the conclusions drawn confirmed that one common source of visual discomfort was when an abrupt change of disparity happened in video transitions, apart from other degradations, such as window violation. New quality assessment was performed to quantify the distribution of disparities over different sequences. The results confirmed that abrupt changes in negative parallax environment produce accommodation-vergence mismatches derived from the increasing time for human crystalline to focus the virtual objects. On the other side, for developing metrics that adapt to human visual system, additional subjective tests were developed to determine the importance of each factor, which masks a concrete distortion. Results demonstrated slight improvement after applying visual attention to objective metrics. This process of weighing pixels approximates the quality results to human eye’s response.
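The saliency weighting of an objective metric described above can be sketched as follows. This is a minimal illustration, not the metric developed in the thesis: it assumes a precomputed saliency map of the same shape as the frames (the function name and normalization choice are assumptions) and applies it to a simple weighted PSNR.

```python
import numpy as np

def saliency_weighted_psnr(ref, dist, saliency, max_val=255.0):
    """PSNR in which each pixel's squared error is weighted by a
    normalized saliency map, so distortions in regions the viewer
    is likely to fixate count more toward the quality score."""
    ref = ref.astype(np.float64)
    dist = dist.astype(np.float64)
    w = saliency / saliency.sum()           # weights sum to 1
    wmse = np.sum(w * (ref - dist) ** 2)    # saliency-weighted MSE
    if wmse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / wmse)
```

With a uniform saliency map this reduces to ordinary PSNR; concentrating the map on a distorted region lowers the score, which is the intended perceptual behavior.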

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Exposure to intimate partner violence (IPV) puts women at risk for severe and chronic physical and mental health consequences, including elevations in IPV-related psychopathology and increased risk for future victimization. Previous research has examined attention as one of the key information processing mechanisms associated with elevated psychopathology and risk for victimization; however, the nature of attentional processing in response to IPV-related information in women exposed to IPV is poorly understood. Therefore, the current study aimed to further understanding of associations between attentional processing, IPV exposure, and related distress using measures of eye movement and subjective interpretations of IPV-related information. A sample of women exposed to IPV (n = 57) viewed sets of negative, positive, and neutral relationship images for 15 s each while having their eye movements monitored and later provided subjective ratings and interpretations of levels of risk and safety in those images. We examined associations of outcome measures with proximal victimization experiences and IPV-related psychopathology (i.e., depression, posttraumatic stress disorder (PTSD), anxiety, and dissociation). Results indicated a bias to attend to negative relationship images relative to positive and neutral images, though this attention bias fluctuated over time and varied as a function of symptomatology such that depression corresponded with increases in attention to negative images over time and PTSD corresponded with decreases in attention to negative images. The general attention bias for negative images appeared to be explained by rumination on and/or difficulty disengaging from negative images, which was related to general elevations in psychopathology as well as exposure to revictimization by different perpetrators. Subjective interpretations and perception of danger cues were related to victimization history and level and type of IPV-related distress. 
We replicated these procedures with a sample of undergraduate students without IPV histories or related symptomatology (n = 33) and found that the overall attention bias for negative images was not replicated, despite general similarities in patterns of attention over time. Results therefore indicated associations between attentional processing and IPV exposure and related symptomatology. Implications for models of IPV-related psychopathology and attentional processing as well as directions for future study and interventions are discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The McGurk effect, in which auditory [ba] dubbed onto [go] lip movements is perceived as "da" or "tha", was employed in a real-time task to investigate auditory-visual speech perception in prelingual infants. Experiments 1A and 1B established the validity of real-time dubbing for producing the effect. In Experiment 2, 4½-month-olds were tested in a habituation-test paradigm, in which an auditory-visual stimulus was presented contingent upon visual fixation of a live face. The experimental group was habituated to a McGurk stimulus (auditory [ba] visual [ga]), and the control group to matching auditory-visual [ba]. Each group was then presented with three auditory-only test trials, [ba], [da], and [ða] (as in then). Visual-fixation durations in test trials showed that the experimental group treated the emergent percept in the McGurk effect, [da] or [ða], as familiar (even though they had not heard these sounds previously) and [ba] as novel. For control group infants [da] and [ða] were no more familiar than [ba]. These results are consistent with infants' perception of the McGurk effect, and support the conclusion that prelinguistic infants integrate auditory and visual speech information. © 2004 Wiley Periodicals, Inc.
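Habituation paradigms like the one above typically declare an infant habituated once looking time falls below a fixed fraction of its initial level. The sketch below shows one common form of that rule; the window size and 50% criterion are assumptions, since the abstract does not state the exact parameters used.

```python
def habituated(look_times, window=3, criterion=0.5):
    """True once the mean fixation duration over any `window`
    consecutive trials drops below `criterion` times the mean of
    the first `window` trials (the baseline). A common infant-
    habituation rule, not necessarily the one used in this study."""
    if len(look_times) < 2 * window:
        return False
    baseline = sum(look_times[:window]) / window
    for i in range(window, len(look_times) - window + 1):
        if sum(look_times[i:i + window]) / window < criterion * baseline:
            return True
    return False
```

Once this criterion is met, the auditory-only test trials begin, and familiarity versus novelty is read off the recovery of visual-fixation durations.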