976 results for Auditory perception.


Relevance:

30.00%

Publisher:

Abstract:

We address the problem of estimating the fundamental frequency of voiced speech. We present a novel solution motivated by the importance of amplitude modulation in sound processing and speech perception. The new algorithm is based on a cumulative spectrum computed from the temporal envelopes of several subbands. We provide a theoretical analysis to derive the new pitch estimator from the temporal envelope of the bandpass speech signal. We report extensive experiments on synthetic and natural vowels, for both real-world noisy and noise-free data. Experimental results show that the new technique performs accurate pitch estimation and is robust to noise. We also show that the technique is superior to the autocorrelation technique for pitch estimation.
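The abstract does not give the algorithm's details, so the following is only a minimal sketch of one way an envelope-based estimator could look. It assumes that the "cumulative spectrum" is the sum of the envelope magnitude spectra across subbands; the band edges, the FFT-based brick-wall filterbank, the search range, and all function names are illustrative assumptions, not the authors' method.

```python
import numpy as np

def analytic_envelope(x):
    """Temporal envelope: magnitude of the analytic signal (FFT-based Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

def bandpass_fft(x, fs, lo, hi):
    """Brick-wall bandpass (assumed filter shape): keep only bins with lo <= f < hi."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    spectrum[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def estimate_pitch(x, fs,
                   band_edges=(500, 1000, 1500, 2000, 2500, 3000),
                   f0_range=(60.0, 400.0)):
    """Return the frequency (Hz) of the largest peak in the summed envelope spectra.

    band_edges and f0_range are hypothetical defaults chosen for illustration.
    """
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    cumulative = np.zeros(len(freqs))
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        envelope = analytic_envelope(bandpass_fft(x, fs, lo, hi))
        envelope = envelope - envelope.mean()        # drop the DC term
        cumulative += np.abs(np.fft.rfft(envelope))  # accumulate across subbands
    in_range = (freqs >= f0_range[0]) & (freqs <= f0_range[1])
    return freqs[in_range][np.argmax(cumulative[in_range])]

if __name__ == "__main__":
    # Synthetic "vowel": 30 harmonics of a 120 Hz fundamental with 1/k amplitudes.
    fs = 16000
    t = np.arange(fs // 2) / fs
    vowel = sum(np.cos(2 * np.pi * 120 * k * t) / k for k in range(1, 31))
    print(estimate_pitch(vowel, fs))
```

Each subband contains several neighbouring harmonics, so its envelope beats at the harmonic spacing, which is the fundamental. Summing the envelope spectra lets every subband vote for the same frequency, which is one plausible reason such an estimator could tolerate noise that corrupts only some bands.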

Relevance:

30.00%

Publisher:

Abstract:

Multiple sound sources often contain harmonics that overlap and may be degraded by environmental noise. The auditory system is capable of teasing apart these sources into distinct mental objects, or streams. Such an "auditory scene analysis" enables the brain to solve the cocktail party problem. A neural network model of auditory scene analysis, called the ARTSTREAM model, is presented to propose how the brain accomplishes this feat. The model clarifies how the frequency components that correspond to a given acoustic source may be coherently grouped together into distinct streams based on pitch and spatial cues. The model also clarifies how multiple streams may be distinguished and separated by the brain. Streams are formed as spectral-pitch resonances that emerge through feedback interactions between the frequency-specific spectral representation of a sound source and its pitch. First, the model transforms a sound into a spatial pattern of frequency-specific activation across a spectral stream layer. The sound has multiple parallel representations at this layer. A sound's spectral representation activates a bottom-up filter that is sensitive to harmonics of the sound's pitch. The filter activates a pitch category which, in turn, activates a top-down expectation that allows one voice or instrument to be tracked through a noisy multiple-source environment. Spectral components are suppressed if they do not match harmonics of the top-down expectation that is read out by the selected pitch, thereby allowing another stream to capture these components, as in the "old-plus-new heuristic" of Bregman. Multiple simultaneously occurring spectral-pitch resonances can thereby emerge. These resonance and matching mechanisms are specialized versions of Adaptive Resonance Theory, or ART, which clarifies how pitch representations can self-organize during learning of harmonic bottom-up filters and top-down expectations.
The model also clarifies how spatial location cues can help to disambiguate two sources with similar spectral cues. Data are simulated from psychophysical grouping experiments, such as how a tone sweeping upwards in frequency creates a bounce percept by grouping with a downward sweeping tone due to proximity in frequency, even if noise replaces the tones at their intersection point. Illusory auditory percepts are also simulated, such as the auditory continuity illusion of a tone continuing through a noise burst even if the tone is not present during the noise, and the scale illusion of Deutsch whereby downward and upward scales presented alternately to the two ears are regrouped based on frequency proximity, leading to a bounce percept. Since related sorts of resonances have been used to quantitatively simulate psychophysical data about speech perception, the model strengthens the hypothesis that ART-like mechanisms are used at multiple levels of the auditory system. Proposals for developing the model to explain more complex streaming data are also provided.
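The matching step described in this abstract, where components that do not fit the harmonics of the selected pitch are released for another stream to capture, can be illustrated with a toy harmonic sieve. This is not the model itself, which relies on resonant feedback dynamics; the hard threshold, the `tolerance` parameter, and the function name are assumptions made purely for illustration.

```python
def split_by_harmonic_match(components_hz, pitch_hz, tolerance=0.03):
    """Split spectral components into those matching harmonics of pitch_hz and the rest.

    tolerance is a hypothetical relative mismatch allowed around each harmonic.
    """
    matched, released = [], []
    for f in components_hz:
        harmonic = max(1, round(f / pitch_hz))   # nearest harmonic number
        target = harmonic * pitch_hz
        if abs(f - target) <= tolerance * target:
            matched.append(f)                    # consistent with the expectation
        else:
            released.append(f)                   # left for another stream
    return matched, released

if __name__ == "__main__":
    mixture = [200, 400, 600, 310, 620, 930]     # two interleaved harmonic sources
    first_stream, leftover = split_by_harmonic_match(mixture, 200.0)
    second_stream, unclaimed = split_by_harmonic_match(leftover, 310.0)
    print(first_stream, second_stream, unclaimed)
```

Running the sieve twice mimics, very loosely, how a second stream can form from the components the first expectation rejected.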

Relevance:

30.00%

Publisher:

Abstract:

A neural model of peripheral auditory processing is described and used to separate features of coarticulated vowels and consonants. After preprocessing of speech via a filterbank, the model splits into two parallel channels, a sustained channel and a transient channel. The sustained channel is sensitive to relatively stable parts of the speech waveform, notably synchronous properties of the vocalic portion of the stimulus. It extends the dynamic range of eighth nerve filters using coincidence detectors that combine operations of raising to a power, rectification, delay, multiplication, time averaging, and preemphasis. The transient channel is sensitive to critical features at the onsets and offsets of speech segments. It is built up from fast excitatory neurons that are modulated by slow inhibitory interneurons. These units are combined over high frequency and low frequency ranges using operations of rectification, normalization, multiplicative gating, and opponent processing. Detectors sensitive to frication and to onset or offset of stop consonants and vowels are described. Model properties are characterized by mathematical analysis and computer simulations. Neural analogs of model cells in the cochlear nucleus and inferior colliculus are noted, as are psychophysical data about perception of CV syllables that may be explained by the sustained-transient channel hypothesis. The proposed sustained and transient processing seems to be an auditory analog of the sustained and transient processing that is known to occur in vision.
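The transient channel's core idea, fast excitation opposed by slower inhibition followed by rectification, can be sketched with two leaky integrators. The abstract gives no equations, so the subtractive interaction, the time constants, and the function names below are assumptions chosen for illustration rather than the model's actual circuitry.

```python
import numpy as np

def leaky_integrate(x, fs, tau):
    """First-order low-pass (leaky integrator) with time constant tau in seconds."""
    alpha = 1.0 - np.exp(-1.0 / (fs * tau))
    y = np.zeros(len(x))
    state = 0.0
    for i, value in enumerate(x):
        state += alpha * (value - state)
        y[i] = state
    return y

def onset_offset_response(envelope, fs, tau_fast=0.002, tau_slow=0.02):
    """Opponent onset/offset responses from fast excitation and slow inhibition.

    tau_fast and tau_slow are hypothetical values, not taken from the paper.
    """
    fast = leaky_integrate(envelope, fs, tau_fast)   # fast excitatory unit
    slow = leaky_integrate(envelope, fs, tau_slow)   # slow inhibitory unit
    difference = fast - slow
    onset = np.maximum(difference, 0.0)              # rectified onset detector
    offset = np.maximum(-difference, 0.0)            # rectified offset detector
    return onset, offset

if __name__ == "__main__":
    fs = 1000
    envelope = np.zeros(500)
    envelope[100:300] = 1.0                          # a 200 ms segment
    onset, offset = onset_offset_response(envelope, fs)
    print(np.argmax(onset), np.argmax(offset))
```

Because the fast unit reaches the new level before the slow one, their difference is briefly positive after an onset and briefly negative after an offset, and it decays toward zero during the steady portion of the segment.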

Relevance:

30.00%

Publisher:

Abstract:

Listeners experience electroacoustic music as full of significance and meaning, and they experience spatiality as one of the factors contributing to its meaningfulness. If we want to understand spatiality in electroacoustic music, we must understand how the listener’s mental processes give rise to the experience of meaning. In electroacoustic music as in everyday life, these mental processes unite the peripheral auditory system with human spatial cognition. In the discussion that follows we consider a range of the listener’s mental processes relating space and meaning from the perceptual attributes of spatial imagery to the spatial reference frames for places and navigation. When considering multichannel loudspeaker systems in particular, an important part of the discussion is focused on the distinctive and idiomatic ways in which this particular mode of sound production contributes to and situates meaning. These idiosyncrasies include the phenomenon of image dispersion, the important consequences of the precedence effect and the influence of source characteristics on spatial imagery. These are discussed in close relation to the practicalities of artistic practice and to the potential for artistic meaning experienced by the listener.

Relevance:

30.00%

Publisher:

Abstract:

In a recent study, we reported that the accurate perception of beat structure in music ('perception of musical meter') accounted for over 40% of the variance in single word reading in children with and without dyslexia (Huss et al., 2011). Performance in the musical task was most strongly associated with the auditory processing of rise time, even though beat structure was varied by manipulating the duration of the musical notes.

Relevance:

30.00%

Publisher:

Abstract:

Despite its importance in social interactions, laughter remains little studied in affective computing. Intelligent virtual agents are often blind to users’ laughter and unable to produce convincing laughter themselves. Respiratory, auditory, and facial laughter signals have been investigated, but laughter-related body movements have received less attention. The aim of this study is threefold. First, to probe human laughter perception by analyzing patterns of categorisations of natural laughter animated on a minimal avatar. Results reveal that a low-dimensional space can describe perception of laughter “types”. Second, to investigate observers’ perception of laughter (hilarious, social, awkward, fake, and non-laughter) based on animated avatars generated from natural and acted motion-capture data. Significant differences in torso and limb movements are found between animations perceived as laughter and those perceived as non-laughter. Hilarious laughter also differs from social laughter. Different body movement features were indicative of laughter in sitting and standing avatar postures. Third, to investigate automatic recognition of laughter to the same level of certainty as observers’ perceptions. Results show that the recognition rates of the Random Forest model approach human rating levels. Classification comparisons and feature importance analyses indicate an improvement in recognition of social laughter when localized features and nonlinear models are used.

Relevance:

30.00%

Publisher:

Abstract:

It is well known that children with auditory processing disorder (APD) have difficulty perceiving speech in background noise. However, there is no consensus on the origin of these listening difficulties. This research program is devoted to studying the impairments underlying speech-in-noise perception problems in children with APD. The Test de Phrases dans le Bruit (TPB; Sentences-in-Noise Test) was developed to examine whether the speech-in-noise perception difficulties of children with APD stem from auditory impairments, cognitive-linguistic impairments, or both. It comprises five lists of 40 sentences, each made up of 20 high-predictability (HP) sentences and 20 low-predictability (LP) sentences, along with a babble noise. Knowledge of the key word (final word) of each sentence was verified with a group of children aged 5 to 7 years. In addition, the intelligibility of the sentences in noise and their level of predictability were measured with adults to ensure equivalence across lists. Finally, the TPB was tested with a group of 15 adults and a group of 69 children without hearing disorders before being administered to children with APD. To address the general objective of the research program, ten children with APD (APD group) and ten children without hearing difficulties, matched for gender and age (control group), were tested with the TPB sentence lists under different listening conditions. The APD group performed significantly worse than the control group on the task of recognizing the final word of sentences presented together with competing babble noise, at signal-to-noise ratios of 0, +3, and +4 dB. The mean difference between HP and LP scores in each experimental noise condition was similar across the two groups.
These results suggest that children with APD do not differ from children in the control group in cognitive-linguistic competence. The origin of speech-in-noise listening difficulties in APD would therefore be auditory in nature. However, the results of the group analyses differ from those of the individual analyses. The various profiles of listening difficulty identified in this cohort underscore the importance of continuing investigations to better understand the origin of speech-in-noise perception problems in APD. A better understanding of the nature of these difficulties will make it possible to identify specific and effective rehabilitation strategies.

Relevance:

30.00%

Publisher:

Abstract:

Perception is described as the set of processes that allow the brain to gather and process sensory information. Atypical perceptual processing is often associated with the autistic phenotype, which is usually described in terms of deficits in social and communication skills together with stereotyped behaviors and restricted interests. The perceptual particularities of autistic people appear at different levels of information processing: autistic people outperform non-autistic people in discriminating simple stimuli, such as pure tones, and in higher-level tasks such as detecting shapes embedded in a complex figure. For low-level perceptual processing specifically, a performance dissociation has been reported in vision. Compared with non-autistic people, autistic people perform better at discriminating luminance-defined stimuli and worse at discriminating texture-defined stimuli. This dichotomous pattern led to a hypothesis suggesting that the extent (or complexity) of the network of cortical regions involved in processing the stimuli could underlie these behavioral differences. Indeed, autistic people perform better when processing visual stimuli that are fully decoded within a single cortical region (simple) and worse for stimuli whose analysis requires the involvement of several cortical regions (complex). Atypical perceptual processing is a general characteristic associated with the autistic phenotype, with particularities reported in both the visual and auditory modalities.
Given the parallels between these two sensory modalities, this thesis aims to test whether the hypothesis proposed to explain certain particularities of visual information processing may also characterize auditory information processing in autism. The first article (Chapter 2) describes the performance of autistic people in processing auditory information, which is sometimes superior and sometimes inferior to that of non-autistic people, and suggests that the complexity of the auditory material to be processed could be related to some of the observed differences. The second article (Chapter 3) presents a quantitative meta-analysis investigating the cortical representation of acoustic complexity in non-autistic people. This work confirms the hierarchical functional organization of the auditory cortex and makes it possible to identify, as in vision, auditory stimuli that can be defined as simple or complex according to the extent of the network of cortical regions required to process them. The third article (Chapter 4) tests whether the predictions of the hypothesis proposed in vision extend to auditory information processing. Specifically, this project compares the brain activations underlying the processing of simple and complex sounds in autistic and non-autistic people. As expected, autistic people show an atypical pattern of activity in response to complex stimuli, that is, those whose processing requires the involvement of several cortical regions. In short, the results as a whole suggest that the predictions of the hypothesis formulated in vision may also apply to audition and possibly explain certain particularities of auditory information processing in autism. This work highlights fundamental differences in perceptual processing, contributing to a better understanding of the mechanisms of information acquisition in this population.

Relevance:

30.00%

Publisher:

Abstract:

This paper is a review of a study to determine if perception through rhythm is contingent upon auditory experience.

Relevance:

30.00%

Publisher:

Abstract:

This study evaluates the progress of children with cochlear implants on the Speech Perception Instructional Curriculum and Evaluation (SPICE) auditory training protocol.

Relevance:

30.00%

Publisher:

Abstract:

This paper studies the effect of residual hearing on post-implant speech perception in children with cochlear implants. The effect of pre-implant auditory experience and the effect of neuronal survival in the implanted ear were investigated.

Relevance:

30.00%

Publisher:

Abstract:

This study examines specific auditory features perceived by profoundly hearing-impaired children using conventional binaural hearing aids and the Nucleus 22 Channel Cochlear Implant. The primary interest of this study was to learn which speech features were most easily perceived by users of each device.

Relevance:

30.00%

Publisher:

Abstract:

This paper discusses a study of chinchillas' ability to organize speech sounds into auditory concepts.

Relevance:

30.00%

Publisher:

Abstract:

This paper reviews a study of cross-modalities and within-modalities and their effects on speech perception.

Relevance:

30.00%

Publisher:

Abstract:

A speech message played several metres from the listener in a room is usually heard to have much the same phonetic content as it does when played nearby, even though the different amounts of reflected sound make the temporal envelopes of these signals very different. To study this ‘constancy’ effect, listeners heard speech messages and speech-like sounds comprising 8 auditory-filter-shaped noise bands that had temporal envelopes corresponding to those in these filters when the speech message is played. The ‘contexts’ were “next you’ll get _ to click on”, into which a “sir” or “stir” test word was inserted. These test words were from an 11-step continuum, formed by amplitude modulation. Listeners identified the test words appropriately, even in the 8-band conditions where the speech had a ‘robotic’ quality. Constancy was assessed by comparing the influence of room reflections on the test word across conditions where the context had either the same level of room reflections (i.e. from the same, far distance) or a much lower level (i.e. from nearby). Constancy effects were obtained with both the natural and the 8-band speech. Results are considered in terms of the degree of ‘matching’ between the context’s and the test word’s bands.