980 resultados para auditory scene analysis


Relevância:

100.00% 100.00%

Publicador:

Resumo:

While humans can easily segregate and track a speaker's voice in a loud noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans is not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the computational model, we conduct a series of electro-encephalography experiments using both simple tone-based stimuli and more natural speech stimulus. Most source segregation algorithms utilize some form of prior information about the target speaker or use more than one simultaneous recording of the noisy speech mixtures. Other methods develop models on the noise characteristics. Source segregation of simultaneous speech mixtures with a single microphone recording and no knowledge of the target speaker is still a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits the difference in the temporal evolution of features that belong to different sources to perform unsupervised monaural source segregation. While using no prior information about the target speaker, this method can gracefully incorporate knowledge about the target speaker to further enhance the segregation.Through a series of EEG experiments we collect neurological evidence to support the principle behind the model. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses of the physiological mechanisms of the remarkable perceptual ability of humans to segregate acoustic sources, and of its psychophysical manifestations in navigating complex sensory environments. Results from EEG experiments provide further insights into the assumptions behind the model and provide motivation for future single unit studies that can provide more direct evidence for the principle of temporal coherence.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Everyday, humans and animals navigate complex acoustic environments, where multiple sound sources overlap. Somehow, they effortlessly perform an acoustic scene analysis and extract relevant signals from background noise. Constant updating of the behavioral relevance of ambient sounds requires the representation and integration of incoming acoustical information with internal representations such as behavioral goals, expectations and memories of previous sound-meaning associations. Rapid plasticity of auditory representations may contribute to our ability to attend and focus on relevant sounds. In order to better understand how auditory representations are transformed in the brain to incorporate behavioral contextual information, we explored task-dependent plasticity in neural responses recorded at four levels of the auditory cortical processing hierarchy of ferrets: the primary auditory cortex (A1), two higher-order auditory areas (dorsal PEG and ventral-anterior PEG) and dorso-lateral frontal cortex. In one study we explored the laminar profile of rapid-task related plasticity in A1 and found that plasticity occurred at all depths, but was greatest in supragranular layers. This result suggests that rapid task-related plasticity in A1 derives primarily from intracortical modulation of neural selectivity. In two other studies we explored task-dependent plasticity in two higher-order areas of the ferret auditory cortex that may correspond to belt (secondary) and parabelt (tertiary) auditory areas. We found that representations of behaviorally-relevant sounds are progressively enhanced during performance of auditory tasks. These selective enhancement effects became progressively larger as you ascend the auditory cortical hierarchy. We also observed neuronal responses to non-auditory, task-related information (reward timing, expectations) in the parabelt area that were very similar to responses previously described in frontal cortex. These results suggests that auditory representations in the brain are transformed from the more veridical spectrotemporal information encoded in earlier auditory stages to a more abstract representation encoding sound behavioral meaning in higher-order auditory areas and dorso-lateral frontal cortex.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses the previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides us the core of objects, which is used to acquire a more accurate object model. Therefore, their growing by specific active regions allows us to obtain an accurate recognition of known regions. Next, a stage of general segmentation provides the segmentation of unknown regions by a bottom-strategy. Finally, the last stage tries to perform a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Furthermore, experimental results are shown and evaluated to prove the validity of our proposal

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Recent interest in the validation of general circulation models (GCMs) has been devoted to objective methods. A small number of authors have used the direct synoptic identification of phenomena together with a statistical analysis to perform the objective comparison between various datasets. This paper describes a general method for performing the synoptic identification of phenomena that can be used for an objective analysis of atmospheric, or oceanographic, datasets obtained from numerical models and remote sensing. Methods usually associated with image processing have been used to segment the scene and to identify suitable feature points to represent the phenomena of interest. This is performed for each time level. A technique from dynamic scene analysis is then used to link the feature points to form trajectories. The method is fully automatic and should be applicable to a wide range of geophysical fields. An example will be shown of results obtained from this method using data obtained from a run of the Universities Global Atmospheric Modelling Project GCM.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Integrating information from multiple sources is a crucial function of the brain. Examples of such integration include multiple stimuli of different modalties, such as visual and auditory, multiple stimuli of the same modality, such as auditory and auditory, and integrating stimuli from the sensory organs (i.e. ears) with stimuli delivered from brain-machine interfaces.

The overall aim of this body of work is to empirically examine stimulus integration in these three domains to inform our broader understanding of how and when the brain combines information from multiple sources.

First, I examine visually-guided auditory, a problem with implications for the general problem in learning of how the brain determines what lesson to learn (and what lessons not to learn). For example, sound localization is a behavior that is partially learned with the aid of vision. This process requires correctly matching a visual location to that of a sound. This is an intrinsically circular problem when sound location is itself uncertain and the visual scene is rife with possible visual matches. Here, we develop a simple paradigm using visual guidance of sound localization to gain insight into how the brain confronts this type of circularity. We tested two competing hypotheses. 1: The brain guides sound location learning based on the synchrony or simultaneity of auditory-visual stimuli, potentially involving a Hebbian associative mechanism. 2: The brain uses a ‘guess and check’ heuristic in which visual feedback that is obtained after an eye movement to a sound alters future performance, perhaps by recruiting the brain’s reward-related circuitry. We assessed the effects of exposure to visual stimuli spatially mismatched from sounds on performance of an interleaved auditory-only saccade task. We found that when humans and monkeys were provided the visual stimulus asynchronously with the sound but as feedback to an auditory-guided saccade, they shifted their subsequent auditory-only performance toward the direction of the visual cue by 1.3-1.7 degrees, or 22-28% of the original 6 degree visual-auditory mismatch. In contrast when the visual stimulus was presented synchronously with the sound but extinguished too quickly to provide this feedback, there was little change in subsequent auditory-only performance. Our results suggest that the outcome of our own actions is vital to localizing sounds correctly. Contrary to previous expectations, visual calibration of auditory space does not appear to require visual-auditory associations based on synchrony/simultaneity.

My next line of research examines how electrical stimulation of the inferior colliculus influences perception of sounds in a nonhuman primate. The central nucleus of the inferior colliculus is the major ascending relay of auditory information before it reaches the forebrain, and thus an ideal target for understanding low-level information processing prior to the forebrain, as almost all auditory signals pass through the central nucleus of the inferior colliculus before reaching the forebrain. Thus, the inferior colliculus is the ideal structure to examine to understand the format of the inputs into the forebrain and, by extension, the processing of auditory scenes that occurs in the brainstem. Therefore, the inferior colliculus was an attractive target for understanding stimulus integration in the ascending auditory pathway.

Moreover, understanding the relationship between the auditory selectivity of neurons and their contribution to perception is critical to the design of effective auditory brain prosthetics. These prosthetics seek to mimic natural activity patterns to achieve desired perceptual outcomes. We measured the contribution of inferior colliculus (IC) sites to perception using combined recording and electrical stimulation. Monkeys performed a frequency-based discrimination task, reporting whether a probe sound was higher or lower in frequency than a reference sound. Stimulation pulses were paired with the probe sound on 50% of trials (0.5-80 µA, 100-300 Hz, n=172 IC locations in 3 rhesus monkeys). Electrical stimulation tended to bias the animals’ judgments in a fashion that was coarsely but significantly correlated with the best frequency of the stimulation site in comparison to the reference frequency employed in the task. Although there was considerable variability in the effects of stimulation (including impairments in performance and shifts in performance away from the direction predicted based on the site’s response properties), the results indicate that stimulation of the IC can evoke percepts correlated with the frequency tuning properties of the IC. Consistent with the implications of recent human studies, the main avenue for improvement for the auditory midbrain implant suggested by our findings is to increase the number and spatial extent of electrodes, to increase the size of the region that can be electrically activated and provide a greater range of evoked percepts.

My next line of research employs a frequency-tagging approach to examine the extent to which multiple sound sources are combined (or segregated) in the nonhuman primate inferior colliculus. In the single-sound case, most inferior colliculus neurons respond and entrain to sounds in a very broad region of space, and many are entirely spatially insensitive, so it is unknown how the neurons will respond to a situation with more than one sound. I use multiple AM stimuli of different frequencies, which the inferior colliculus represents using a spike timing code. This allows me to measure spike timing in the inferior colliculus to determine which sound source is responsible for neural activity in an auditory scene containing multiple sounds. Using this approach, I find that the same neurons that are tuned to broad regions of space in the single sound condition become dramatically more selective in the dual sound condition, preferentially entraining spikes to stimuli from a smaller region of space. I will examine the possibility that there may be a conceptual linkage between this finding and the finding of receptive field shifts in the visual system.

In chapter 5, I will comment on these findings more generally, compare them to existing theoretical models, and discuss what these results tell us about processing in the central nervous system in a multi-stimulus situation. My results suggest that the brain is flexible in its processing and can adapt its integration schema to fit the available cues and the demands of the task.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Analysis of either footprints or footwear impressions which have been recovered from a crime scene is a well known and well accepted part of forensic investigation. When this evidence is obtained by investigating officers, comparative analysis to a suspect’s evidence may be undertaken. This can be done either by the detectives or in some cases, podiatrists with experience in forensic analysis. Frequently asked questions of a podiatrist include; “What additional information should be collected from a suspect (for the purposes of comparison), and how should it be collected?” This paper explores the answers to these and related questions based on 20 years of practical experience in the field of crime scene analysis as it relates to podiatry and forensics. Elements of normal and abnormal foot function are explored and used to explain the high degree of variability in wear patterns produced by the interaction of the foot and footwear. Based on this understanding the potential for identifying unique features of the user and correlating this to footwear evidence becomes apparent. Standard protocols adopted by podiatrists allow for more precise, reliable, and valid results to be obtained from their analysis. Complex data sets are now being obtained by investigating officers and, in collaboration with the podiatrist; higher quality conclusions are being achieved. This presentation details the results of investigations which have used standard protocols to collect and analyse footwear and suspects of recent major crimes.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An on-road study was conducted to evaluate a complementary tactile navigation signal on driving behaviour and eye movements for drivers with hearing loss (HL) compared to drivers with normal hearing (NH). 32 participants (16 HL and 16 NH) performed two preprogrammed navigation tasks. In one, participants received only visual information, while the other also included a vibration in the seat to guide them in the correct direction. SMI glasses were used for eye tracking, recording the point of gaze within the scene. Analysis was performed on predefined regions. A questionnaire examined participant's experience of the navigation systems. Hearing loss was associated with lower speed, higher satisfaction with the tactile signal and more glances in the rear view mirror. Additionally, tactile support led to less time spent viewing the navigation display.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The issue of dynamic spectrum scene analysis in any cognitive radio network becomes extremely complex when low probability of intercept, spread spectrum systems are present in environment. The detection and estimation become more complex if frequency hopping spread spectrum is adaptive in nature. In this paper, we propose two phase approach for detection and estimation of frequency hoping signals. Polyphase filter bank has been proposed as the architecture of choice for detection phase to efficiently detect the presence of frequency hopping signal. Based on the modeling of frequency hopping signal it can be shown that parametric methods of line spectral analysis are well suited for estimation of frequency hopping signals if the issues of order estimation and time localization are resolved. An algorithm using line spectra parameter estimation and wavelet based transient detection has been proposed which resolves above issues in computationally efficient manner suitable for implementation in cognitive radio. The simulations show promising results proving that adaptive frequency hopping signals can be detected and demodulated in a non cooperative context, even at a very low signal to noise ratio in real time.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The issue of dynamic spectrum scene analysis in any cognitive radio network becomes extremely complex when low probability of intercept, spread spectrum systems are present in environment. The detection and estimation become more complex if frequency hopping spread spectrum is adaptive in nature. In this paper, we propose two phase approach for detection and estimation of frequency hoping signals. Polyphase filter bank has been proposed as the architecture of choice for detection phase to efficiently detect the presence of frequency hopping signal. Based on the modeling of frequency hopping signal it can be shown that parametric methods of line spectral analysis are well suited for estimation of frequency hopping signals if the issues of order estimation and time localization are resolved. An algorithm using line spectra parameter estimation and wavelet based transient detection has been proposed which resolves above issues in computationally efficient manner suitable for implementation in cognitive radio. The simulations show promising results proving that adaptive frequency hopping signals can be detected and demodulated in a non cooperative context, even at a very low signal to noise ratio in real time.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The research reported here concerns the principles used to automatically generate three-dimensional representations from line drawings of scenes. The computer programs involved look at scenes which consist of polyhedra and which may contain shadows and various kinds of coincidentally aligned scene features. Each generated description includes information about edge shape (convex, concave, occluding, shadow, etc.), about the type of illumination for each region (illuminated, projected shadow, or oriented away from the light source), and about the spacial orientation of regions. The methods used are based on the labeling schemes of Huffman and Clowes; this research provides a considerable extension to their work and also gives theoretical explanations to the heuristic scene analysis work of Guzman, Winston, and others.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An active, attentionally-modulated recognition architecture is proposed for object recognition and scene analysis. The proposed architecture forms part of navigation and trajectory planning modules for mobile robots. Key characteristics of the system include movement planning and execution based on environmental factors and internal goal definitions. Real-time implementation of the system is based on space-variant representation of the visual field, as well as an optimal visual processing scheme utilizing separate and parallel channels for the extraction of boundaries and stimulus qualities. A spatial and temporal grouping module (VWM) allows for scene scanning, multi-object segmentation, and featural/object priming. VWM is used to modulate a tn~ectory formation module capable of redirecting the focus of spatial attention. Finally, an object recognition module based on adaptive resonance theory is interfaced through VWM to the visual processing module. The system is capable of using information from different modalities to disambiguate sensory input.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Resumo I (Prática Pedagógica) - Nesta secção do relatório de estágio pretende-se expor e caracterizar todos os elementos referentes à prática pedagógica, integrada no estágio, que ocorreu em parceria com a Escola de Música do Conservatório Nacional (EMCN) e o Projecto Orquestra Geração realizado no ano letivo de 2014/2015. Este tem por objetivo, a abordagem dos aspetos pedagógicos, as metodologias de ensino, a eficácia do trabalho, a motivação e os agentes motivacionais. Para a elaboração deste relatório, foram observados e devidamente analisados três alunos de forma peculiar. Porém, é de realçar que um dos intervenientes pertencia à EMCN e as aulas apenas eram observadas. Ao longo do ano letivo foram elaborados individualmente para cada aluno, vinte e dois planos de aula e ainda realizadas três gravações de aulas que foram vistas pelo professor orientador. À posteriori serão caraterizadas as duas entidades que fizeram parceria, contribuindo para a possível realização do relatório de estágio. Será abordada a sua contextualização histórica, os parâmetros de ensino, o contexto sociocultural. De seguida, será apresentada a caraterização dos três alunos, abordando nomeadamente, o seu meio envolvente, os métodos de estudo e a evolução motora, expressiva e auditiva. Serão analisadas e descritas algumas das práticas de ensino, desenvolvidas quer ao longo das aulas individuais, quer da observação das gravações e da análise constante dos planos de aulas e das planificações. Para finalizar o relatório de estágio, será elaborada uma reflexão crítica, acerca do desempenho como docente e respetiva prática pedagógica trabalhada com os alunos.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Human electrophysiological studies support a model whereby sensitivity to so-called illusory contour stimuli is first seen within the lateral occipital complex. A challenge to this model posits that the lateral occipital complex is a general site for crude region-based segmentation, based on findings of equivalent hemodynamic activations in the lateral occipital complex to illusory contour and so-called salient region stimuli, a stimulus class that lacks the classic bounding contours of illusory contours. Using high-density electrical mapping of visual evoked potentials, we show that early lateral occipital cortex activity is substantially stronger to illusory contour than to salient region stimuli, whereas later lateral occipital complex activity is stronger to salient region than to illusory contour stimuli. Our results suggest that equivalent hemodynamic activity to illusory contour and salient region stimuli probably reflects temporally integrated responses, a result of the poor temporal resolution of hemodynamic imaging. The temporal precision of visual evoked potentials is critical for establishing viable models of completion processes and visual scene analysis. We propose that crude spatial segmentation analyses, which are insensitive to illusory contours, occur first within dorsal visual regions, not the lateral occipital complex, and that initial illusory contour sensitivity is a function of the lateral occipital complex.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A new method for the automated selection of colour features is described. The algorithm consists of two stages of processing. In the first, a complete set of colour features is calculated for every object of interest in an image. In the second stage, each object is mapped into several n-dimensional feature spaces in order to select the feature set with the smallest variables able to discriminate the remaining objects. The evaluation of the discrimination power for each concrete subset of features is performed by means of decision trees composed of linear discrimination functions. This method can provide valuable help in outdoor scene analysis where no colour space has been demonstrated as being the most suitable. Experiment results recognizing objects in outdoor scenes are reported

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Under the framework of the European Union Funded SAFEE project(1), this paper gives an overview of a novel monitoring and scene analysis system developed for use onboard aircraft in spatially constrained environments. The techniques discussed herein aim to warn on-board crew about pre-determined indicators of threat intent (such as running or shouting in the cabin), as elicited from industry and security experts. The subject matter experts believe that activities such as these are strong indicators of the beginnings of undesirable chains of events or scenarios, which should not be allowed to develop aboard aircraft. This project aimes to detect these scenarios and provide advice to the crew. These events may involve unruly passengers or be indicative of the precursors to terrorist threats. With a state of the art tracking system using homography intersections of motion images, and probability based Petri nets for scene understanding, the SAFEE behavioural analysis system automatically assesses the output from multiple intelligent sensors, and creates. recommendations that are presented to the crew using an integrated airborn user interface. Evaluation of the system is conducted within a full size aircraft mockup, and experimental results are presented, showing that the SAFEE system is well suited to monitoring people in confined environments, and that meaningful and instructive output regarding human actions can be derived from the sensor network within the cabin.