995 results for voice fundamental frequency
Abstract:
The medical field requires fast, simple, and noninvasive diagnostic methods. Several such methods have become possible thanks to technological advances that provide the means of collecting and processing signals. This thesis details work done on voice signals. New methods of analysis, such as nonlinear dynamics, have been developed to understand the complexity of voice signals and to explore their dynamic nature. The purpose of this thesis is to distinguish the complexity of pathological voice signals from that of healthy signals and to differentiate stuttering signals from healthy signals. The efficiency of various acoustic and nonlinear time series methods is analysed. Three groups of samples are used: healthy individuals, subjects with vocal pathologies, and stuttering subjects. Individual vowels and continuous speech data for the Malayalam sentence "iruvarum changatimaranu" (in English, "Both are good friends") are recorded using a microphone. The recorded audio is converted to digital signals and subjected to analysis. Acoustic perturbation measures such as fundamental frequency (F0), jitter, shimmer, and zero crossing rate (ZCR) are computed, and nonlinear measures such as the maximum Lyapunov exponent (λmax), correlation dimension (D2), Kolmogorov exponent (K2), and a new measure of entropy, permutation entropy (PE), are evaluated for all three groups of subjects. Permutation entropy is a nonlinear complexity measure that can efficiently distinguish the regular and complex nature of any signal and extract information about changes in the dynamics of the process by indicating sudden changes in its value. The results show that nonlinear dynamical methods seem to be a suitable technique for voice signal analysis, owing to the chaotic component of the human voice.
Permutation entropy is well suited because of its sensitivity to uncertainties, since the pathologies are characterized by an increase in signal complexity and unpredictability. Pathological groups have higher entropy values than the normal group. The stuttering signals have lower entropy values than the normal signals. PE is effective in characterising the level of improvement after two weeks of speech therapy in stuttering subjects. PE is also effective in characterizing the dynamical difference between healthy and pathological subjects. This suggests that PE can improve and complement the voice analysis methods currently available to clinicians. The work establishes the application of the simple, inexpensive, and fast PE algorithm for diagnosis in vocal disorders and stuttering subjects.
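As an illustration of the permutation entropy measure described in this abstract, the following is a minimal sketch of the standard Bandt-Pompe ordinal-pattern formulation. It is not the thesis's implementation: the function name `permutation_entropy` and the parameters `order`, `delay`, and `normalize` are assumptions introduced here, and the thesis does not state which embedding settings it used.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Permutation entropy (in bits) of a 1-D sequence.

    `order` is the length of each ordinal pattern and `delay` the spacing
    between the samples that form it; both are illustrative defaults.
    """
    n_windows = len(signal) - (order - 1) * delay
    if order < 2 or n_windows < 1:
        raise ValueError("signal too short for the chosen order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [signal[start + k * delay] for k in range(order)]
        # Ordinal pattern: the indices that sort the window
        # (ties are broken by position because sorted() is stable).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    probabilities = [count / n_windows for count in patterns.values()]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy
```

With normalization, a perfectly regular signal such as a monotonic ramp scores 0, while a signal in which all ordinal patterns are equally likely approaches 1.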
Abstract:
To examine the basis of emotional changes to the voice, physiological and electroglottal measures were combined with acoustic speech analysis of 30 men performing a computer task in which they lost or gained points under two levels of difficulty. Predictions of the main effects of difficulty and reward on the voice were not borne out by the data. Instead, vocal changes depended largely on interactions between gain versus loss and difficulty. The rate at which the vocal folds open and close (fundamental frequency; f0) was higher for loss than for gain when difficulty was high, but not when difficulty was low. Electroglottal measures revealed that f0 changes corresponded to shorter glottal open times for the loss conditions. Longer closed and shorter open phases were consistent with raised laryngeal tension in difficult loss conditions. Similarly, skin conductance indicated higher sympathetic arousal in loss than gain conditions, particularly when difficulty was high. The results provide evidence of the physiological basis of affective vocal responses, confirming the utility of measuring physiology and voice in the study of emotion.
Abstract:
The aim of this thesis is to investigate computerized voice assessment methods for distinguishing normal from dysarthric speech signals. The proposed system introduces computerized assessment methods equipped with signal processing and artificial intelligence techniques. Each subject read the sentences used for the measurement of inter-stress intervals (ISI), and these recordings were analyzed to compare normal and impaired voices. A band-pass filter was used to preprocess the speech samples. Speech segmentation is performed using signal energy and spectral centroid to separate voiced and unvoiced regions of the speech signal. Acoustic features are extracted from the LPC model and from the speech segments of each audio signal to find anomalies. The speech features assessed for classification are energy entropy, zero crossing rate (ZCR), spectral centroid, mean fundamental frequency (Meanf0), jitter (RAP), jitter (PPQ), and shimmer (APQ). Naïve Bayes (NB) was used for speech classification. For speech test-1 and test-2, NB achieved classification accuracies of 72% and 80%, respectively, between healthy and impaired speech samples. For speech test-3, NB achieved 64% correct classification. The results point to the possibility of classifying speech impairment in PD patients based on the clinical rating scale.
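Two of the features listed in this abstract, zero crossing rate and jitter (RAP), can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis's code: the function names are hypothetical, the glottal periods passed to `jitter_rap` are assumed to have been extracted already by some pitch tracker, and RAP is computed with the commonly used three-point definition.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def jitter_rap(periods):
    """Relative average perturbation of consecutive glottal periods.

    Each period is compared with the mean of itself and its two
    neighbours; the average absolute deviation is then divided by the
    mean period. `periods` is an assumed input (seconds per cycle).
    """
    if len(periods) < 3:
        raise ValueError("need at least three periods")
    mean_period = sum(periods) / len(periods)
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3)
        for i in range(1, len(periods) - 1)
    ]
    return (sum(deviations) / len(deviations)) / mean_period
```

A perfectly periodic voice gives a RAP of 0; larger values indicate more cycle-to-cycle variation in the fundamental period.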
Abstract:
OBJECTIVE: to study the fundamental frequency value and its variations in the pain cry of newborns. METHODS: the emissions of 111 healthy term newborns, aged 24 to 72 hours, were recorded during peripheral venipuncture. Acoustic analysis was performed with the software VOXMETRIA 1.1, to extract the fundamental frequency value, and GRAM 5.7, to check for fundamental frequency variations such as breaks, bitonality, and very high (hyperacute) frequency. The NIPS pain scale was applied at the moment of the puncture. The statistical analysis is descriptive, reporting the mean, standard deviation, and frequency of occurrence of the events. RESULTS: 100% of the newborns' emissions showed frequency variations, that is, breaks and bitonality. Hyperacute frequency was found in 34.2% of the newborns. CONCLUSION: through crying, the newborn communicates pain. The newborn's pain emission is tense and strident, with a high fundamental frequency and variations visible in the spectrographic tracing, such as breaks, bitonality, and hyperacute frequency. These characteristics are important for drawing the adult's attention so that the newborn is attended to promptly, and they help in assessing pain during a procedure.
Abstract:
Spectrographic analysis of male actors' voices showed a cluster, the actor's formant (AF), which is related to the perception of good and projected voice quality. To date, similar phenomena have not been described in the voices of actresses. Therefore, the objective of the current investigation was to compare actresses' and nonactresses' voices through acoustic analysis to verify the existence of the AF cluster or the strategies used to produce the performing voice. Thirty actresses and 30 nonactresses volunteered as subjects in the present study. All subjects read a 40-second text at both habitual and loud levels. Praat (v.5.1) was then used to analyze equivalent sound pressure level (Leq), speaking fundamental frequency (SFF), and in the long-term average spectrum window, the difference between the amplitude level of the fundamental frequency and first formant (L1 - L0), the spectral tilt (alpha ratio), and the amplitude and frequency of the AF region. Significant differences between the groups, in both levels, were observed for SFF and L1 - L0, with actresses presenting lower values. There were no significant differences between groups for Leq or alpha ratio at either level. There was no evidence of an AF cluster in the actresses' voices. Voice projection for this group of actresses seemed to be mainly a result of a laryngeal setting instead of vocal tract resonances.
Abstract:
This study investigates the possible differences between actors' and nonactors' vocal projection strategies using acoustic and perceptual analyses. A total of 11 male actors and 10 male nonactors volunteered as subjects, reading an extended text sample at habitual, moderate, and loud levels. The samples were analyzed for sound pressure level (SPL), alpha ratio (difference between the average SPL of the 1-5 kHz region and the average SPL of the 50 Hz-1 kHz region), fundamental frequency (F0), and long-term average spectrum (LTAS). Through LTAS, the mean frequency of the first formant (F1) range, the mean frequency of the actor's formant, the level differences between the F1 frequency region and the F0 region (L1-L0), and the level differences between the strongest peak at 0-1 kHz and that at 3-4 kHz were measured. Eight voice specialists perceptually evaluated the degree of projection, loudness, and tension in the samples. The actors had a greater alpha ratio, a stronger level of the actor's formant range, and a higher degree of perceived projection and loudness at all loudness levels. SPL, however, did not differ significantly between the actors and nonactors, and no differences were found in the mean formant frequency ranges. The alpha ratio and the relative level of the actor's formant range seemed to be related to the degree of perceived loudness. From the physiological point of view, a more favorable glottal setting, providing a higher glottal closing speed, may be characteristic of these actors' projected voices. So, the projected voices, in this group of actors, were more related to the glottic source than to the resonance of the vocal tract.
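The alpha ratio defined in this abstract can be illustrated with a short sketch. It assumes the LTAS is already available as (frequency in Hz, level in dB) pairs and that the band level is the plain arithmetic mean of the dB values in each band; the study does not say how its band averages were computed, and some tools sum energy instead. The function name `alpha_ratio` and its parameters are hypothetical.

```python
def alpha_ratio(ltas, low_band=(50.0, 1000.0), high_band=(1000.0, 5000.0)):
    """Mean level (dB) of the 1-5 kHz band minus that of the 50 Hz-1 kHz band.

    `ltas` is an iterable of (frequency_hz, level_db) pairs. Averaging the
    dB values directly is an assumption made for this sketch.
    """
    points = list(ltas)
    low = [level for freq, level in points if low_band[0] <= freq < low_band[1]]
    high = [level for freq, level in points if high_band[0] <= freq <= high_band[1]]
    if not low or not high:
        raise ValueError("LTAS has no points in one of the bands")
    return sum(high) / len(high) - sum(low) / len(low)
```

A less negative alpha ratio means relatively more energy above 1 kHz, which corresponds to the flatter spectral slope the abstract reports for the actors.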
Abstract:
Given the growing use of digital signal processing techniques in electronic and/or power system applications, this article discusses the use of the Recursive Discrete Fourier Transform (RDFT) to identify the phase angle, frequency, and amplitude of the fundamental grid voltages, regardless of waveform distortion or amplitude transients. It is shown that, if the fundamental frequency of the measured voltages coincides with the frequency for which the DFT was designed, a simple RDFT algorithm is fully capable of providing the required phase, frequency, and amplitude information. Two additional algorithms are proposed to ensure correct performance when the frequency differs from its nominal value: one to correct the phase error of the output signal and another to identify the amplitude of the fundamental component. In addition, with the proposed algorithms, the fundamental component can be identified in at most 2 grid cycles, regardless of the input signal. The RDFT results were analyzed through computer simulations. Experimental results on the synchronization of a synchronous generator with the electrical grid, using the signals provided by the RDFT, are also presented.
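The basic idea of a recursive DFT that tracks only the fundamental component can be sketched as below. This is a generic single-bin sliding DFT, not the article's algorithm: it covers only the nominal-frequency case and omits the two correction algorithms the article proposes for off-nominal frequencies. The class name `SlidingDFT` and the parameter `samples_per_cycle` are assumptions introduced for illustration.

```python
import cmath
import math
from collections import deque


class SlidingDFT:
    """Recursive single-bin DFT over one nominal fundamental cycle."""

    def __init__(self, samples_per_cycle):
        self.n = samples_per_cycle
        # The window starts filled with zeros, so the bin starts at zero.
        self.window = deque([0.0] * self.n, maxlen=self.n)
        self.twiddle = cmath.exp(2j * math.pi / self.n)
        self.bin = 0j

    def update(self, sample):
        """Slide the window by one sample and update the fundamental bin."""
        oldest = self.window[0]
        self.window.append(sample)  # maxlen discards the oldest sample
        self.bin = (self.bin - oldest + sample) * self.twiddle
        return self.bin

    def amplitude(self):
        """Peak amplitude of the fundamental component."""
        return 2.0 * abs(self.bin) / self.n

    def phase(self):
        """Phase (radians, cosine reference) at the oldest sample in the window."""
        return cmath.phase(self.bin)
```

Each update costs one complex multiplication regardless of window length. When the window spans exactly one cycle of the fundamental, integer harmonics fall on other bins and do not disturb the amplitude or phase estimate.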
Abstract:
Introduction: To study the characteristics of the pain vocal emission of newborns during venepuncture through acoustic analysis and to relate it to the NIPS pain scale and some variables of the newborns. Methods: Emissions of 111 healthy term newborns, aged 24 to 72 h, were recorded. The acoustic analysis was performed with GRAM 5.7 software, verifying the occurrence of tense strangled voice quality, sounds, concentration of acoustic energy, breaks, double harmonic breaks and frequency instability, type of phonation, vocal attack, and cough. The NIPS scale was applied during venepuncture, followed by descriptive statistical analysis and correlation through the Spearman test. Results: One hundred percent of the emissions had guttural sounds, vowels, hard vocal attack, frequency, breaks, double harmonic breaks, and tense strangled voice quality; 34.2% had higher fundamental frequency; 62.2% had periods of emission absence; and 100% showed tracing instability, concentration of acoustic energy, and inspiratory and expiratory phonation. Cough occurred in 14.4%. The signs of vocal tract constriction were associated with all the parameters. There was a negative correlation between the higher fundamental frequencies and the weight of the newborns and a positive correlation between cough and NIPS score. Conclusions: The newborn pain emission is tense and strident; the modifications of frequency and spectrographic tracing and the presence of sounds show laryngeal and vocal tract participation. The lower the newborn's weight, the more frequent the higher fundamental frequency with tense strangled voice quality, and the higher the NIPS score, the more frequent the cough. Such characteristics make pain crying peculiar, helping in the evaluation of pain during a procedure. © 2006 Elsevier B.V. All rights reserved.
Abstract:
This paper proposes a novel and simple positive sequence detector (PSD), which is inherently self-adjustable to fundamental frequency deviations by means of a software-based PLL (Phase Locked Loop). Since the proposed positive sequence detector is not based on Fortescue's classical decomposition and no special input filtering is needed, its dynamic response may be as fast as one fundamental cycle. The digital PLL ensures that the positive sequence components can be calculated even under distorted waveform conditions and fundamental frequency deviations. For the purpose of validating the proposed models, the positive sequence detector has been implemented in a PC-based Power Quality Monitor and experimental results illustrate its good performance. The PSD algorithm has also been evaluated in the control loop of a Series Active Filter and simulation results demonstrate its effectiveness in a closed-loop system. Moreover, considering single-phase applications, this paper also proposes a general single-phase PLL and a Fundamental Wave Detector (FWD) immune to frequency variations and waveform distortions. © 2005 IEEE.
Electroglottographic analysis of actresses and nonactresses' voices in different levels of intensity
Abstract:
Background: Previous studies with long-term average spectrum (LTAS) showed the importance of the glottal source for understanding the projected voices of actresses. In this study, electroglottographic (EGG) analysis was used to investigate the contribution of the glottal source to the projected voice, comparing actresses' and nonactresses' voices at different levels of intensity. Method: Thirty actresses and 30 nonactresses sustained vowels at habitual, moderate, and loud intensity levels. The EGG variables were contact quotient (CQ), closing quotient (QCQ), and opening quotient (QOQ). Other variables were sound pressure level (SPL) and fundamental frequency (F0). A KayPENTAX EGG was used. Variables were entered into a general linear model. Results/Discussion: Actresses showed significantly higher values for SPL at all levels, and both groups increased SPL significantly while changing from habitual to moderate and further to loud. There were no significant differences between groups for EGG quotients. There were significant differences between the levels only for F0 and CQ, for both groups. Conclusion: SPL was significantly higher among actresses at all intensity levels, but in the EGG analysis, no differences were found. This apparently weak contribution of the glottal source in the supposedly projected voices of actresses, contrary to previous LTAS studies, might be because of a higher subglottal pressure or perhaps a greater vocal tract contribution to SPL. Results from the present study suggest that trained subjects did not produce a significantly higher SPL than untrained individuals by increasing the cost in terms of higher vocal fold collision and hence more impact stress. Future research should explore the difference between trained and nontrained voices through aerodynamic measurements to evaluate the relationship between physiologic findings and the acoustic and EGG data.
Moreover, further studies should consider both types of vocal tasks, sustained vowel and running speech, for both EGG and LTAS analysis. © 2013 The Voice Foundation.
Abstract:
Objectives: Vocally trained actresses are expected to have more vocal economy than nonactresses. Therefore, we hypothesize that there will be differences in the electroglottogram-based voice economy parameter quasi-output cost ratio (QOCR) between actresses and nonactresses. This difference should remain across different levels of intensity. Methods: A total of 30 actresses and 30 nonactresses were recruited for this study. Participants from both groups were required to sustain the vowels /a/, /i/, and /u/, in habitual, moderate, and high intensity levels. Acoustic variables such as sound pressure level (SPL), fundamental frequency (F0), and glottal contact quotient (CQ) were obtained. The QOCR was then calculated. Results: There were no significant differences among the groups for QOCR. Positive correlations were observed for QOCR versus SPL and QOCR versus F0 in all intensity levels. Negative correlation was found between QOCR and CQ in all intensity levels. Considering the differences among intensity levels, from habitual to moderate and from moderate to loud, only the CQ did not differ significantly. The QOCR, SPL, and F0 presented significant differences throughout the different intensity levels. Conclusion: The QOCR did not reflect the level of vocal training when comparing trained and nontrained female subjects in the present study. Both groups demonstrated more vocal economy in moderate and high intensity levels owing to more voice output without an increase in glottal adduction. © 2013 The Voice Foundation.
Abstract:
Pós-graduação em Educação - FFC
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Purpose. The present study aimed to compare the voices of actors/actresses with those of vocally untrained subjects through aerodynamic and electroglottographic (EGG) analyses. We hypothesized that glottal and breathing functions would reflect technical and physiological differences between vocally trained and untrained subjects. Methods. Forty participants with normal voices participated in this study (20 professional theater actors and 20 untrained participants). In each group, 10 male and 10 female subjects were assessed. All participants underwent aerodynamic and EGG assessment of voice. From the Phonatory Aerodynamic System, three protocols were used: comfortable sustained phonation with EGG, voice efficiency with EGG, and running speech. Contact quotient was calculated from EGG. All phonatory tasks were produced at three different loudness levels. Mean sound pressure level and fundamental frequency were also assessed. Univariate, multivariate, and correlation statistical analyses were performed. Results. The main differences between vocally trained and untrained participants were found in the following variables: mean sound pressure level, phonatory airflow, subglottic pressure, inspiratory airflow duration, inspiratory airflow, and inspiratory volume. These variables were greater for trained participants. Mean pitch was found to be lower for trained voices. Conclusions. The glottal source seemed to make a weak contribution to differentiating training status in the speaking voice. More prominent differences between vocally trained and untrained participants appear in respiratory-related variables. These findings may be related to better management of breathing function (better breath support).
Abstract:
Pós-graduação em Bases Gerais da Cirurgia - FMB