995 results for VOICE QUALITY
Pós-graduação em Bases Gerais da Cirurgia - FMB
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Various types of trill exercises have long been used as a tool in the treatment and preparation of the voice. Although they are reported to produce vocal benefits in most subjects, their physiology has not yet been studied in depth. The aim of this study was to compare the mean and standard deviation of the closed quotient in lip and tongue trill exercises with those of the sustained vowel /ɛ/ in opera singers. Ten professional classical (operatic) singers, reportedly in perfect laryngeal health, served as subjects for this study and underwent electroglottography. During the examination, the subjects were instructed to produce the sustained vowel /ɛ/ and lip and tongue trills at the same pre-established frequency and intensity. The mean values and standard deviation of the closed quotient were obtained using software developed for this purpose. Results were compared within subjects; maximum intensities were compared only with one another, as were minimum intensities. The mean closed quotient differed significantly only at the strong intensities, where the lip trill differed from the tongue trill and the sustained vowel /ɛ/. The standard deviation of the closed quotient distinguished the sustained vowel /ɛ/ from the lip and tongue trills at both intensities. We concluded that the closed quotient oscillates during tongue and lip trill exercises, and that it is higher during the lip trill than in the other two utterances, only at the strong intensities.
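As a rough illustration of the two summary statistics compared in this study, the sketch below computes a closed quotient for each glottal cycle (closed-phase duration divided by cycle period) and then its mean and standard deviation. The function names and the sample durations are hypothetical; the study used its own purpose-built software, whose method is not described in the abstract.

```python
from statistics import mean, stdev

def closed_quotients(closed_durations_ms, periods_ms):
    """Return the closed quotient (closed phase / period) for each glottal cycle."""
    if len(closed_durations_ms) != len(periods_ms):
        raise ValueError("need one closed-phase duration per cycle period")
    return [c / p for c, p in zip(closed_durations_ms, periods_ms)]

def summarize_cq(closed_durations_ms, periods_ms):
    """Mean and sample standard deviation of the per-cycle closed quotient."""
    cq = closed_quotients(closed_durations_ms, periods_ms)
    return mean(cq), stdev(cq)

# Hypothetical per-cycle measurements in milliseconds, not data from the study.
cq_mean, cq_sd = summarize_cq([2.0, 2.5, 3.0], [5.0, 5.0, 5.0])
```

A larger standard deviation across cycles would correspond to the oscillation of the closed quotient that the authors report for the trills.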
Objectives. To evaluate whether the overall dysphonia grade, roughness, breathiness, asthenia, and strain (GRBAS) scale and the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) scale show the same reliability and consensus when applied to the same vocal sample at different times. Study Design. Observational cross-sectional study. Methods. Sixty subjects had their voices recorded according to the tasks proposed in the CAPE-V scale. Vowels /a/ and /i/ were sustained for between 3 and 5 seconds. Reproduction of six sentences and spontaneous speech elicited by the request "Tell me about your voice" were analyzed. For the analysis of the GRBAS scale, the sustained vowel and sentence reading tasks were used. Auditory-perceptual voice analyses were conducted by three expert speech therapists with more than 5 years of experience and familiar with both scales. Results. A strong correlation was observed in the intrajudge consensus analysis for both the GRBAS scale and the CAPE-V, with intraclass coefficient values ranging from 0.923 to 0.985. A high degree of correlation between the general GRBAS and CAPE-V grades (coefficient = 0.842) was observed, with similar distributions of dysphonia grades in both scales. The evaluators indicated a mild difficulty in applying the GRBAS scale and low to mild difficulty in applying the CAPE-V scale. The three evaluators agreed in indicating the GRBAS scale as the fastest and the CAPE-V scale as the most sensitive, especially for detecting small changes in voice. Conclusions. The two scales are reliable and are indicated for use in analyzing voice quality.
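The intraclass coefficients reported above can be illustrated with a short sketch. The abstract does not say which ICC form was used, so the code below assumes ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rating); the function name `icc_2_1` is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rating.

    `ratings` is a list of rows, one per voice sample; each row holds the
    scores given to that sample by the k raters (or in k rating sessions).
    """
    n = len(ratings)       # number of rated samples
    k = len(ratings[0])    # number of raters or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: samples, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

For an intrajudge analysis such as the one described, each column would hold the same judge's ratings from a different session rather than ratings from different judges.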
OBJECTIVE: To analyze the impact of auditory training on the auditory-perceptual evaluation of voice performed by Speech-Language Pathology students. METHODS: Over two semesters, 17 students taking theoretical courses on phonation (Phonation/Phonation Disorders) analyzed samples of altered and unaltered voices (selected for this study) using the GRBAS scale. All received auditory training over a total of nine weekly meetings of about 15 minutes each. In each meeting one parameter was presented, using voices different from those in the evaluated sample, in which the trained aspect predominated. The samples were rated with the scale before and after training and at four other points over the course of the meetings. The students' ratings were compared with a reference rating made beforehand by three speech-language pathologists specialized in voice. The Friedman test and the Kappa agreement index were used to verify the effectiveness of the training. RESULTS: The students' rate of correct answers before training was considered fair to good. The number of correct answers was maintained across the evaluations for most parameters of the scale. After training, there was improvement in the analysis of asthenia, a parameter that was emphasized because of the difficulties the students presented. Correct answers decreased for the roughness parameter after it had been worked on in a segmented way, as hoarseness and harshness, and associated with different diagnoses and acoustic parameters. CONCLUSION: Auditory training enhances students' initial abilities, refining them for performing the evaluation, and also guides adjustments to the dynamics of the courses.
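The agreement between a student's ratings and the judges' reference ratings can be sketched as follows. The abstract names only a "Kappa" index, so unweighted Cohen's kappa is an assumption here, and the function name and the sample ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels.

    Undefined (division by zero) when both raters use one and the same
    category for every item, because chance agreement is then 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Agreement expected by chance from each rater's category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical GRBAS grades (0 to 3) for eight voices, not data from the study.
student = [0, 1, 2, 1, 0, 2, 1, 3]
reference = [0, 1, 2, 2, 0, 2, 1, 3]
kappa = cohen_kappa(student, reference)
```

Because GRBAS grades are ordinal, a weighted kappa that penalizes near misses less than distant ones may be the more natural choice in practice.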
OBJECTIVE: To carry out a systematic review of research on the vocal characteristics of children or adults with hearing impairment who use cochlear implants. SEARCH STRATEGY: A search was conducted with the Portuguese descriptors for voice, voice quality, and cochlear implant, and their English equivalents, in the databases Web of Science, Bireme, the USP theses and dissertations portal, and the CAPES theses and dissertations database. SELECTION CRITERIA: The criteria adopted included a title consistent with the purpose of this study and a sample necessarily comprising children or adults with severe to profound hearing impairment, pre- or postlingual, who use cochlear implants and who had undergone auditory-perceptual and/or acoustic analysis of voice quality. RESULTS: Twenty-seven studies were classified according to the levels of evidence and quality indicators used by the American Speech-Language-Hearing Association (ASHA). The designs of the studies analyzed were considered to provide medium and low scientific evidence. Six studies were classified as evidence level IIb, 20 as III, and one as IV. CONCLUSION: The voice quality of children or adults with hearing impairment who use cochlear implants has been studied only on a small scale. There is not a sufficient number of studies with a high level of evidence that demonstrate precisely the effects of the cochlear implant on the voice quality of these individuals.
Nadeina set out to develop methods of speech development in Russian as a mother tongue, focusing on improving diction, training in voice quality control, intonation control, the removal of dialect, and speech etiquette. She began with training in the receptive skills of language, i.e. reading and listening, since the interpretation of someone else's language plays an important role in language production. Her studies of students' reading speed showed that it varies between 40 and 120 words per minute, which is normally considered very slow. She discovered a strong correlation between reading speed and speaking skills: the slower a person reads, the worse their ability to speak. She has designed exercises to improve reading skills. Nadeina also believes that listening to other people's speech is very important, both to analyse its content and in some cases as an example, so listening skills need to be developed. Many people have poor pronunciation habits acquired as children. On the basis of speech samples from young Russians (male and female, aged 17-22), Nadeina analysed the commonest speech faults: nasalisation, hesitation and hemming at the end of sense-groups, etc. Using a group of twenty listeners, she looked for a correlation between how voice quality is perceived and certain voice quality parameters, e.g. pitch range, tremulousness, fluency, whispering, harshness, sonority, tension and audible breath. She found that the fewer non-linguistic segment variations appeared in speech, the more attractive the speech was rated. The results are included in a textbook aimed at helping people to improve their oral skills and to communicate ideas to an audience. She believes this will assist Russian officials in their attempts to communicate their ideas to different social spheres, and also foreigners learning Russian.
Children with nonorganic voice disorders (NVDs) are treated mainly using direct voice therapy techniques such as the accent method or glottal attack changes and indirect methods such as vocal hygiene and voice education. However, both approaches tackle only the symptoms and not the etiological factors in the family dynamics and therefore often enjoy little success. The aim of the "Bernese Brief Dynamic Intervention" (BBDI) for children with NVD was to extend the effectiveness of pediatric voice therapies with a psychosomatic concept combining short-term play therapy with the child and family dynamic counseling of the parents. This study compares the therapeutic changes in three groups where different procedures were used, before intervention and 1 year afterward: counseling of parents (one to two consultations; n = 24), Brief Dynamic Intervention on the lines of the BBDI (three to five play therapy sessions with the child plus two to four sessions with the parents; n = 20), and traditional voice therapy (n = 22). A Voice Questionnaire for Parents developed by us, with 59 questions to be answered on a four-point Likert scale, was used to measure the change. According to the parents' assessment, a significant improvement in voice quality was achieved with all three methods. Counseling of parents (A) appears to have led parents to give their child more latitude; for example, they stopped nagging the child or demanding that he/she should behave strictly by the rules. After BBDI (B), the mothers were more responsive to their children's wishes and the children were more relaxed and their speech became livelier. At home, they called out to them less often from a distance, which probably improved parent-child dialog. Traditional voice therapy (C) seems to have had a positive effect on the children's social competence. BBDI seems to have the deepest, widest, and therefore probably the most enduring therapeutic effect on children with NVD.
BioMet®Tools is a set of software applications developed for the biometrical characterization of voice in fields such as voice quality evaluation in laryngology, speech therapy and rehabilitation, education of the singing voice, forensic voice analysis in court, emotion detection in voice, and secure access to facilities and services. It was initially conceived as plain research code to estimate the glottal source from voice and to obtain the biomechanical parameters of the vocal folds from the spectral density of the estimate. This code grew into what is now the Glottex®Engine package (G®E). Further demands from users in the medical and forensic fields prompted the development of different graphical user interfaces (GUIs) to encapsulate user interaction with the G®E. This required the customized design of different GUIs driving the same G®E, which saved development costs and time. The development model leading to commercial production and distribution is described in detail. Case studies from its application to laryngology and speech therapy are given and discussed.
Voice pathologies have recently become a social problem of some significance. Urban pollution, habits such as smoking, the use of air conditioning, and so on contribute to it. The problem is more relevant for professionals who use their voice frequently, such as broadcasters, singers, teachers, or call-center operators. For all these reasons, diagnostic support techniques that can draw clinical conclusions from a voice sample recorded with a microphone are of special interest, as opposed to invasive ones that involve examination with laryngoscopes, fiberscopes, or video endoscopes, techniques that are in any case far more uncomfortable for patients because they require partially inserting the instrument through the throat, in procedures considered surgical in nature. Among the former techniques, much progress has been made in a relatively short period of time. As regards the diagnosis of pathologies, over the last fifteen years we have gone from working mainly with parameters extracted from the voice signal (in both the time and the frequency domains) and with scales built from subjective assessments by experts, to also working with parameters derived from estimates of the glottal source. The importance of using the glottal source lies, broadly speaking, in the fact that it is a signal directly linked to the state of the speaker's laryngeal structure, and also in that it is generally less influenced by the vocal tract than the voice signal. It is well known that the vocal tract is more closely related to the spoken message, and its presence hinders the process of detecting vocal pathology. These estimates of the glottal source have been obtained through inverse filtering techniques developed by our research group.
We have also managed to look more deeply into the nature of the glottal signal: we are able to decompose it and relate it to biomechanical parameters of the vocal folds themselves, obtaining estimates of elements such as the mass, the energy loss, or the elasticity of the body and the cover of the fold, among others. The components of the glottal source also give rise to the so-called biometric parameters, related to the shape of the signal, which in themselves constitute a biometric signature of the individual. We will also work with temporal parameters, related to the different stages observed within the glottal signal during a phonation cycle. Finally, we will consider classical perturbation and energy parameters of the signal. In short, we now have a considerable number of glottal parameters that make up a multidimensional statistical basis, intended to discriminate people with pathological or dysphonic voices from those with no voice pathology, that is, with healthy or normophonic voices. This doctoral thesis addresses several questions. First, these new parameters need to be analyzed carefully, so we will offer a complete statistical description of them. We will also study questions such as the distribution of the parameters with respect to criteria such as their statistical normality, paying special attention to the difference between the distributions shown by healthy subjects and by subjects with vocal pathology. For all this we will use different statistical techniques: generation of descriptive items and diagrams, normality tests, and various hypothesis tests, both parametric and nonparametric, which will consider the difference between the groups of healthy people and the groups of people with some voice-related pathology.
In addition, we are interested in finding statistical relationships between the parameters, in order to eliminate possible redundancies in the model, to reduce the dimensionality of the problem, and to establish a criterion of relative importance among the parameters in terms of their discriminating capacity for the pathological/healthy criterion. To this end, statistical techniques such as Bivariate Linear Correlation and Factor Analysis based on Principal Components will be applied. Finally, we will use the well-known classification technique Discriminant Analysis, applied to different combinations of parameters and factors, to determine which of them offer the most promising success rates. The experiments were carried out on a balanced and robust database of two hundred subjects, one hundred female and one hundred male, with an equally balanced proportion of subjects who present vocal pathology and subjects who do not. One of the software applications designed to collect the samples is also presented in this thesis. The different statistical studies carried out will allow us to identify the parameters that contribute most to detecting the presence of vocal pathology. Some of the studies will also allow us to present a ranking of the parameters based on their importance for detection. We will also conclude that it is sometimes advisable to reduce the dimensionality of the parameters in order to improve detection rates. Finally, the detection rates themselves are perhaps the most important conclusion of the work.
All the analyses in this work will be carried out for each of the two genders, in line with several previous studies showing that the male and female genders should be treated independently because of the organic differences observed between them. However, as regards the detection of vocal pathology, we will also consider the possibility of working with the unified database, verifying that the success rates are also high. Abstract Voice pathologies have recently become a social problem of some concern. Pollution in cities, smoking habits, air conditioning, etc. contribute to it. This problem is more relevant for professionals who use their voice frequently: speakers, singers, teachers, actors, telemarketers, etc. Therefore, techniques capable of drawing conclusions from a sample of the recorded voice are of particular interest for diagnosis, as opposed to invasive ones involving exploration with laryngoscopes, fiberscopes, or video endoscopes, which are much less comfortable for patients. Voice quality analysis has come a long way in a relatively short period of time. In regard to the diagnosis of diseases, we have gone in the last fifteen years from working primarily with parameters extracted from the voice signal (in both the time and frequency domains) and with scales drawn from subjective assessments by experts to producing more accurate evaluations with estimates derived from the glottal source. The importance of using the glottal source resides broadly in that this signal is linked to the state of the speaker's laryngeal structure. Unlike the voice signal (phonated speech), the glottal source, if suitably reconstructed using adaptive lattices, may be less influenced by the vocal tract.
As is well known, the vocal tract is related to the articulation of the spoken message, and its influence complicates the process of voice pathology detection, unlike when using the reconstructed glottal source, where vocal tract influence has been almost completely removed. The estimates of the glottal source have been obtained through inverse filtering techniques developed by our research group. We have also delved deeper into the nature of the glottal signal, dissecting it and relating it to the biomechanical parameters of the vocal folds, obtaining several estimates of items such as the mass, loss, or elasticity of the cover and body of the vocal fold, among others. The components of the glottal source also give rise to the so-called biometric parameters, related to the shape of the signal, which are themselves a biometric signature of the individual. We will also work with temporal parameters related to the different stages that are observed in the glottal signal during a cycle of phonation. Finally, we will take into consideration classical perturbation and energy parameters. In short, we now have a considerable number of glottal parameters forming a multidimensional statistical basis, designed to discriminate people with pathologic or dysphonic voices from those who do not show pathology. This thesis addresses several issues: first, a careful analysis of these new parameters is required, so we will offer a complete statistical description of them. We will also discuss issues such as the distribution of the parameters, considering criteria such as their statistical normality. We will take special care in the analysis of the difference between the distributions from healthy subjects and the distributions from pathological subjects. To reach these goals we will use different statistical techniques, such as generation of descriptive items and diagrams, tests for normality, and hypothesis testing, both parametric and nonparametric.
These latter techniques consider the difference between the groups of healthy subjects and the groups of people with a voice-related illness. In addition, we are interested in finding statistical relationships between parameters, for several reasons: to eliminate possible redundancies in the model, to reduce the dimensionality of the problem, and to establish a criterion of relative importance among the parameters, expressed in terms of their discriminatory power for the pathological/healthy criterion. To this end, statistical techniques such as Bivariate Linear Correlation and Factor Analysis based on Principal Components will be applied. Finally, we will use the well-known Discriminant Analysis classification technique, applied to different combinations of parameters and factors, to determine which of these combinations offer the most promising success rates. To perform the experiments we have used a balanced and robust database consisting of two hundred speakers, one hundred male and one hundred female, in which subjects with and without vocal pathology are equally represented. A computer application designed to carry out the collection of samples is also presented in this thesis. The different statistical analyses performed will allow us to determine which parameters contribute most decisively to the detection of vocal pathology. Some of the analyses will even allow us to present a ranking of the parameters based on their importance for the detection of vocal pathology. We will also conclude that it is sometimes desirable to perform a dimensionality reduction in order to improve the detection rates. Finally, the detection rates themselves are perhaps the most important conclusion of the work.
All the analyses presented in this work have been performed for each of the two genders, in agreement with previous studies showing that male and female genders should be treated independently due to the observed functional differences between them. However, with regard to the detection of vocal pathology, we will also consider the possibility of working with the unified database, verifying that the success rates obtained are also high.
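The redundancy screening that the thesis abstract attributes to Bivariate Linear Correlation can be sketched as below: compute Pearson's r for every pair of parameters and flag pairs that are almost linearly dependent. The parameter names, the values, and the 0.9 threshold are hypothetical choices for illustration, not taken from the thesis.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def redundant_pairs(params, threshold=0.9):
    """Return (name_a, name_b, r) for parameter pairs with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(params.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Hypothetical parameter values for five speakers (illustrative only).
params = {
    "jitter": [0.4, 0.6, 0.5, 1.2, 1.5],
    "shimmer": [2.0, 3.0, 2.5, 6.0, 7.5],    # exactly 5 x jitter, so r is 1
    "body_mass": [3.1, 2.9, 3.0, 3.2, 2.8],  # weakly related to the others
}
pairs = redundant_pairs(params)
```

One member of each flagged pair could then be dropped, or the correlated group replaced by a principal component, before running the discriminant analysis.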
This paper predicts speech synthesis, speech recognition, and speaker recognition technology for the year 2001, and it describes the most important research problems to be solved in order to arrive at these ultimate synthesis and recognition systems. The problems for speech synthesis include natural and intelligible voice production, prosody control based on meaning, capability of controlling synthesized voice quality and choosing individual speaking style, multilingual and multidialectal synthesis, choice of application-oriented speaking styles, capability of adding emotion, and synthesis from concepts. The problems for speech recognition include robust recognition against speech variations, adaptation/normalization to variations due to environmental conditions and speakers, automatic knowledge acquisition for acoustic and linguistic modeling, spontaneous speech recognition, naturalness and ease of human-machine interaction, and recognition of emotion. The problems for speaker recognition are similar to those for speech recognition. The research topics related to all these techniques include the use of articulatory and perceptual constraints and evaluation methods for measuring the quality of technology and systems.
Auditory-perceptual evaluation plays a fundamental role in the study and assessment of the voice; however, because it is subjective, it is subject to imprecision and variation. Acoustic analysis, on the other hand, allows results to be reproduced, but it needs to be improved, since it does not accurately analyze voices with more severe dysphonia and chaotic waveforms. Developing measures that provide reliable knowledge about vocal function therefore answers a long-standing need in this line of research and clinical practice. In this context, the use of artificial intelligence, such as artificial neural networks, appears to be a promising approach. Objective: To validate an automatic system using artificial neural networks for the evaluation of rough and breathy voices. Materials and methods: 150 voices, ranging from neutral to those with an intense degree of roughness and/or breathiness, were selected from the database of the Clínica de Fonoaudiologia of the Faculdade de Odontologia de Bauru (FOB/USP). Of these voices, 23 were excluded for not meeting the inclusion criteria for the sample, so 123 voices were used. Procedures: auditory-perceptual evaluation using the 100 mm visual analog scale and the four-point numerical scale; extraction of features from the voice signal by means of the Wavelet Packet Transform and the acoustic parameters jitter, shimmer, derivative amplitude, and pitch amplitude; and validation of the classifier through parameterization, training, testing, and evaluation of the artificial neural networks. Results: In the auditory-perceptual evaluation, the Intraclass Correlation Coefficient (ICC) showed excellent inter- and intra-judge agreement, with p = 0.85 for inter-judge agreement and p ranging from 0.87 to 0.93 for intra-judge agreement.
Regarding the performance of the artificial neural network in discriminating breathiness and roughness and their respective degrees, the best performance for breathiness was found with the subset composed of jitter, pitch amplitude, and fundamental frequency, which yielded a 74% success rate, excellent agreement with the auditory-perceptual evaluation on the visual analog scale (ICC of 0.80), and a mean error of 9 mm. For roughness, the best subset was composed of the Wavelet Packet Transform with 1 decomposition level, jitter, shimmer, pitch amplitude, and fundamental frequency, which yielded a 73% success rate, excellent agreement (ICC of 0.84), and a mean error of 10 mm. Conclusion: The use of artificial intelligence based on artificial neural networks to identify and grade roughness and breathiness showed excellent reliability (ICC > 0.80), with results similar to inter-judge agreement. The artificial neural network thus proves to be a promising methodology for vocal evaluation, its greatest advantage being objectivity in the evaluation.
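Two of the acoustic features fed to the network, jitter and shimmer, can be sketched as follows. The abstract does not state which variants were used, so this assumes the common "local" definitions: the mean absolute difference between consecutive cycles, expressed as a percentage of the mean value. The function names and the sample periods are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values, in % of their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def local_jitter(periods):
    """Cycle-to-cycle variation of the glottal period (any time unit)."""
    return relative_perturbation(periods)

def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (any linear unit)."""
    return relative_perturbation(peak_amplitudes)

# Hypothetical cycle periods in milliseconds, not data from the study.
periods_ms = [5.0, 5.1, 4.9, 5.0]
jitter_pct = local_jitter(periods_ms)
```

Both measures depend on reliable cycle detection, which is consistent with the abstract's remark that acoustic analysis is less accurate for severely dysphonic voices with chaotic waveforms.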
Objective/Hypothesis: The purpose of this study was to examine respiratory function in a group of patients with muscle tension dysphonia (MTD). Design: Cross-sectional analytical study. Methods: Participants included 15 people with a diagnosis of MTD referred to speech pathology for management of their voice disorder, with fiberoptic evidence of glottal or supraglottic constriction during phonation, with or without posterior chink or bowing, combined with deviation in perceptual voice quality. A second group of 15 participants with no history of voice disorder served as healthy controls. Baseline pulmonary function test measures included forced expiratory volume in the first second (FEV1), FVC, FEF25 to 75, FIF50, FEV1/FVC ratio, and FEF50/FIF50 ratio. Hypertonic saline challenge test measures included FEV1 and FIF50 after provocation, dose response slope, and provocation dose. Results: Compared with healthy controls, participants with MTD demonstrated a higher prevalence of glottal constriction during inspiration after provocation with nebulized hypertonic saline, as demonstrated by a reduction in FIF50 after the hypertonic saline challenge. There was no significant difference between the MTD and healthy control groups in baseline pulmonary function testing. Participants with MTD demonstrated a higher prevalence than healthy controls of abnormal glottic closure during inspiration similar to paradoxical vocal fold movement (PVFM). This suggests either that they had previously undiagnosed coexisting PVFM or that the condition of MTD could be expanded to include descriptions of aberrant glottic function during respiration. This study enhances the understanding of PVFM and MTD by combining research advances made in the fields of otolaryngology and respiratory medicine.
In recent years, the multiparametric approach for evaluating perceptual rating of voice quality has been advocated. This study evaluates the accuracy of predicting perceived overall severity of voice quality with a minimal set of aerodynamic, voice range profile (phonetogram), and acoustic perturbation measures. One hundred and twelve dysphonic persons (93 women and 19 men) with laryngeal pathologies and 41 normal controls (35 women and six men) with normal voices participated in this study. Perceptual severity judgement was carried out by four listeners rating the G (overall grade) parameter of the GRBAS scale. The minimal set of instrumental measures was selected based on the ability of the measure to discriminate between dysphonic and normal voices, and to attain at least a moderate correlation with perceived overall severity. Results indicated that perceived overall severity was best described by maximum phonation time of sustained /a/, peak intraoral pressure of the consonant-vowel /pi/ strings production, voice range profile area, and acoustic jitter. Direct-entry discriminant function analysis revealed that these four voice measures in combination correctly predicted 67.3% of perceived overall severity levels.
Parents around the world sing and speak to their babies. These two types of vocalization directed at preverbal children share several similarities as well as differences, but their consequences for babies remain poorly understood. The aim of this thesis was to document the relative effectiveness of singing and speech in capturing babies' attention over short periods of time (Study 1) and in regulating babies' affect by maintaining a state of contentment over a prolonged period of time (Study 2). The first study explored the attentional reactions of babies exposed to unfamiliar audio recordings of singing and speech. In Experiment 1, babies aged 4 to 13 months were exposed to happy infant-directed speech (syllable sequences) and lullabies hummed by the same woman. They listened significantly longer to the speech, which contained much more acoustic variability and expressiveness than the lullabies. In Experiment 2, babies of comparable ages showed no differential listening to a spoken or sung version of a Turkish children's song, both versions being expressed in a joyful/happy manner. The babies in Experiment 3, who heard the sung version of the Turkish song as well as a version spoken in an affectively neutral or adult-directed manner, listened significantly longer to the sung version. Overall, happy vocal quality rather than vocal mode (sung versus spoken) was the main determinant of the baby's attention, regardless of age. In the second study, babies' affect regulation was explored as a function of exposure to unfamiliar audio recordings of singing or speech. The babies were exposed to singing or speech until they met a criterion of dissatisfaction expressed in the face.
In Experiment 1, babies aged 7 to 10 months listened to recordings of infant-directed speech, adult-directed speech, or singing in an unfamiliar language (Turkish). The babies listened to the singing almost twice as long as to the speech before showing dissatisfaction. In Experiment 2, babies were exposed to recordings of speech or singing drawn from natural interactions between mother and baby, in a familiar language. As in Experiment 1, infant-directed singing was considerably more effective than speech in delaying the onset of discontent. The temporal construction of singing, notably its regular rhythm, stable tempo, and repetitions, could play an important role in affect regulation by sustaining attention, enhancing familiarity, or promoting predictive listening and entrainment. In sum, the studies presented in this thesis reveal, for the first time, that singing is a powerful parental tool, just as effective as speech for capturing attention and more effective than speech for keeping babies in a peaceful state. These findings underline the usefulness of singing in everyday life and its potential usefulness in a variety of therapeutic contexts involving babies.