29 results for Formants
Abstract:
How speech is separated perceptually from other speech remains poorly understood. Recent research suggests that the ability of an extraneous formant to impair intelligibility depends on the modulation of its frequency, but not its amplitude, contour. This study further examined the effect of formant-frequency variation on intelligibility by manipulating the rate of formant-frequency change. Target sentences were synthetic three-formant (F1 + F2 + F3) analogues of natural utterances. Perceptual organization was probed by presenting stimuli dichotically (F1 + F2C + F3C; F2 + F3), where F2C + F3C constitute a competitor for F2 and F3 that listeners must reject to optimize recognition. Competitors were derived using formant-frequency contours extracted from extended passages spoken by the same talker and processed to alter the rate of formant-frequency variation, such that rate scale factors relative to the target sentences were 0, 0.25, 0.5, 1, 2, and 4 (0 = constant frequencies). Competitor amplitude contours were either constant, or time-reversed and rate-adjusted in parallel with the frequency contour. Adding a competitor typically reduced intelligibility; this reduction increased with competitor rate until the rate was at least twice that of the target sentences. Similarity in the results for the two amplitude conditions confirmed that formant amplitude contours do not influence across-formant grouping. The findings indicate that competitor efficacy is not tuned to the rate of the target sentences; most probably, it depends primarily on the overall rate of frequency variation in the competitor formants. This suggests that, when segregating the speech of concurrent talkers, differences in speech rate may not be a significant cue for across-frequency grouping of formants.
Abstract:
This paper presents the results of a multivariate spatial analysis of 38 vowel formant variables in the language of 402 informants from 236 cities from across the contiguous United States, based on the acoustic data from the Atlas of North American English (Labov, Ash & Boberg, 2006). The results of the analysis both confirm and challenge the results of the Atlas. Most notably, while the analysis identifies patterns similar to those of the Atlas in the West and the Southeast, it finds that the Midwest and the Northeast are distinct dialect regions that are considerably stronger than the traditional Midland and Northern dialect regions identified in the Atlas. The analysis also finds evidence that a western vowel shift is actively shaping the language of the Western United States.
Abstract:
How speech is separated perceptually from other speech remains poorly understood. Recent research indicates that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This study explored the effects of manipulating the depth and pattern of that variation. Three formants (F1+F2+F3) constituting synthetic analogues of natural sentences were distributed across the two ears, together with a competitor for F2 (F2C) that listeners must reject to optimize recognition (left = F1+F2C; right = F2+F3). The frequency contours of F1 to F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Competitors were created either by inverting the frequency contour of F2 about its geometric mean (a plausibly speech-like pattern) or by using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. This suggests that competitor impact depends on overall depth of frequency variation, not depth relative to that for the target formants. The absence of tuning (i.e., no minimum in intelligibility for the 50% case) suggests that the ability to reject an extraneous formant does not depend on similarity in the depth of formant-frequency variation. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints. © 2014 The Author(s).
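The two contour manipulations named in this abstract (scaling the depth of a formant-frequency contour and inverting it about its geometric mean) can be illustrated with a minimal sketch. It assumes that both operations act on a log-frequency axis, which is what inversion about a geometric mean implies; the abstract does not state how the authors implemented them, and the function names `geometric_mean`, `scale_depth`, and `invert_contour` are hypothetical.

```python
import math

def geometric_mean(freqs_hz):
    """Geometric mean of a sequence of positive frequencies in Hz."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))

def scale_depth(freqs_hz, depth):
    """Scale contour excursions about the geometric mean on a log axis.

    Assumption (not from the abstract): depth acts as an exponent on the
    ratio f / geometric_mean, so depth=1.0 leaves the contour unchanged,
    depth=0.5 halves the log-frequency excursions, and depth=0.0 yields
    a constant contour at the geometric mean.
    """
    gm = geometric_mean(freqs_hz)
    return [gm * (f / gm) ** depth for f in freqs_hz]

def invert_contour(freqs_hz):
    """Mirror the contour about its geometric mean (depth of -1)."""
    return scale_depth(freqs_hz, -1.0)

# Illustrative F2 track in Hz (made-up values, not data from the study).
f2_track = [1000.0, 2000.0, 4000.0]
f2_competitor = invert_contour(f2_track)        # rises become falls
f2_half_depth = scale_depth(f2_track, 0.5)      # 50%-depth version
```

Under this assumption the 0%-depth, 50%-depth, and 100%-depth competitors differ only in the exponent applied to the same underlying contour.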
Abstract:
The metallic voice is usually confused with ring or nasality by singers and untrained listeners who are not used to perceptual vocal analysis. They believe a metallic voice results from a rise in fundamental frequency. A diagnostic error in this respect may lead to lowering the pitch, an incorrect procedure that could cause vocal overload and fatigue. The purpose of this article is to study the quality of metallic voice considering the correlation between information on the physiological and acoustic planes, based on a perceptive consensual assumption. Fiberscopic video pharyngolaryngoscopy was performed on 21 professional singers while they spoke the vowel [e] in normal and metallic modes to observe muscular movements and structural changes of the velopharynx, pharynx, and larynx. Vocal samples captured simultaneously with the fiberscopic examination were acoustically analyzed. The frequency and amplitude of the first four formants (F1, F2, F3, and F4) were extracted by means of the linear predictor coefficients (LPC) spectrum and were statistically analyzed. Vocal tract adjustments such as velar lowering, pharyngeal wall narrowing, laryngeal rise, and aryepiglottic and lateral laryngeal constrictions were frequently found. There were no significant changes in the frequency and amplitude of F1 in the metallic voice; there were significant increases in the amplitudes of F2, F3, and F4 and in frequency for F; and metallic voice perceived as louder was correlated with an increase in the amplitude of F3 and F4. Physiological adjustments of the velopharynx, pharynx, and larynx are combined in characterizing the metallic voice and can be acoustically related to changes in formant pattern.
Abstract:
Objective: To assess, in patients undergoing glossectomy, the influence of the palatal augmentation prosthesis on the speech intelligibility and acoustic spectrographic characteristics of the formants of oral vowels in Brazilian Portuguese, specifically the first 3 formants (F1 [/a,e,u/], F2 [/o,o,u/], and F3 [/a,o/]). Design: Speech evaluation with and without a palatal augmentation prosthesis using blinded randomized listener judgments. Setting: Tertiary referral center. Patients: Thirty-six patients (33 men and 3 women) aged 30 to 80 (mean [SD], 53.9 [10.5]) years underwent glossectomy (14, total glossectomy; 12, total glossectomy and partial mandibulectomy; 6, hemiglossectomy; and 4, subtotal glossectomy) with use of the augmentation prosthesis for at least 3 months before inclusion in the study. Main Outcome Measures: Spontaneous speech intelligibility (assessed by expert listeners using a 4-category scale) and spectrographic formant assessment. Results: We found a statistically significant improvement in spontaneous speech intelligibility and in the average number of correctly identified syllables with the use of the prosthesis (P < .05). Statistically significant differences occurred for the F1 values of the vowels /a,e,u/; for F2 values, there was a significant difference for the vowels /o,o,u/; and for F3 values, there was a significant difference for the vowels /a,o/ (P < .001). Conclusions: The palatal augmentation prosthesis improved the intelligibility of spontaneous speech and syllables for patients who underwent glossectomy. It also increased the F2 and F3 values for all vowels and the F1 values for the vowels /o,o,u/. This effect brought the values of many vowel formants closer to normal.
Abstract:
This working paper is a preliminary study that aims to analyze acoustically several phonetic variables for the forensic purpose of speaker identification: the frequencies of the first two formants of the vowel [ə] when it is used as a filler, the duration of [m] segments taking the syllabic context into account, and, finally, the frequency peaks in voiceless strident fricatives ([s]) using LPC analysis. The results reveal statistically significant differences between speakers.
Abstract:
This thesis is an experimental study regarding the identification and discrimination of vowels, studied using synthetic stimuli. The acoustic attributes of synthetic stimuli vary, which raises the question of how different spectral attributes are linked to the behaviour of the subjects. The spectral attributes used are formants and spectral moments (centre of gravity, standard deviation, skewness and kurtosis). Two types of experiments are used, related to the identification and discrimination of the stimuli, respectively. The discrimination is studied by using both the attentive procedures that require a response from the subject, and the preattentive procedures that require no response. Together, the studies offer information about the identification and discrimination of synthetic vowels in 15 different languages. Furthermore, this thesis discusses the role of various spectral attributes in the speech perception processes. The thesis is divided into three studies. The first is based only on attentive methods, whereas the other two concentrate on the relationship between identification and discrimination experiments. The neurophysiological methods (EEG recordings) reveal the role of attention in processing, and are used in discrimination experiments, while the results reveal differences in perceptual processes based on the language, attention and experimental procedure.
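The four spectral moments listed in this abstract (centre of gravity, standard deviation, skewness, and kurtosis) can be computed by treating a power spectrum as a weight distribution over frequency. The sketch below is one common formulation and is an assumption, since the thesis abstract does not say how the moments were calculated; in particular, reporting excess kurtosis (subtracting 3) is a choice made here, and `spectral_moments` is a hypothetical name.

```python
import math

def spectral_moments(freqs_hz, power):
    """Return (centre of gravity, standard deviation, skewness, kurtosis).

    freqs_hz: frequency of each spectral bin in Hz.
    power:    non-negative power in each bin, used as the weight.
    Kurtosis is returned as excess kurtosis (a flat-topped spectrum is
    negative, a peaky one positive); this convention is an assumption.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")

    # First moment: power-weighted mean frequency.
    cog = sum(f * p for f, p in zip(freqs_hz, power)) / total

    def central_moment(order):
        # Power-weighted central moment of the given order.
        return sum((f - cog) ** order * p
                   for f, p in zip(freqs_hz, power)) / total

    sd = math.sqrt(central_moment(2))
    skewness = central_moment(3) / sd ** 3
    kurtosis = central_moment(4) / sd ** 4 - 3.0
    return cog, sd, skewness, kurtosis

# Illustrative three-bin spectrum (made-up values).
cog, sd, skewness, kurtosis = spectral_moments([100.0, 200.0, 300.0],
                                               [1.0, 2.0, 1.0])
```

A symmetric spectrum such as the one in the example has zero skewness, so a non-zero value indicates energy tilted toward lower or higher frequencies relative to the centre of gravity.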
Abstract:
In a leading service economy like India, services lie at the very center of economic activity. Competitive organizations now look not only at the skills and knowledge, but also at the behavior required by an employee to be successful on the job. Emotionally competent employees can effectively deal with occupational stress and maintain psychological well-being. This study explores the scope of the first two formants and jitter to assess seven common emotional states present in natural speech in English. The k-means method was used to classify emotional speech as neutral, happy, surprised, angry, disgusted and sad. The accuracy of classification obtained using raw jitter was more than 65 percent for happy and sad but lower for the others. The overall classification accuracy was 72% in the case of preprocessed jitter. The experimental study was done on 1664 English utterances of 6 females. This is a simple, interesting and more proactive method for employees from varied backgrounds to become aware of their own communication styles as well as those of their colleagues and customers, and is therefore socially beneficial. It is also an inexpensive method, as it requires only a computer. Since knowledge of sophisticated software or signal processing is not necessary, it is easy to analyze
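Jitter, one of the features used in this abstract, measures cycle-to-cycle variation in the glottal period. The abstract does not say which jitter measure or preprocessing was applied, so the sketch below assumes the widely used relative local jitter (mean absolute difference between consecutive periods divided by the mean period); `local_jitter` is a hypothetical name, and the period values are made up.

```python
def local_jitter(periods_s):
    """Relative local jitter of a sequence of glottal periods in seconds.

    Assumed definition: the mean absolute difference between consecutive
    periods, divided by the mean period. Returns a unitless ratio
    (multiply by 100 for a percentage).
    """
    if len(periods_s) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(a - b) for a, b in zip(periods_s, periods_s[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods_s) / len(periods_s)
    return mean_diff / mean_period

# Illustrative periods around 10 ms (roughly a 100 Hz voice).
steady = local_jitter([0.010, 0.010, 0.010, 0.010])
shaky = local_jitter([0.010, 0.011, 0.010, 0.011])
```

A per-utterance value like this, together with the first two formant frequencies, is the kind of feature vector that could then be passed to a clustering method such as k-means.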
Abstract:
Background: Voice processing in real time is challenging. A drawback of previous work on Hypokinetic Dysarthria (HKD) recognition is the requirement of controlled settings in a laboratory environment. A personal digital assistant (PDA) has been developed for home assessment of Parkinson's disease (PD) patients. The PDA offers sound processing capabilities, which allow for developing a module for recognition and quantification of HKD. Objective: To compose an algorithm for assessment of PD speech severity in the home environment based on a review synthesis. Methods: A two-tier review methodology is utilized. The first tier focuses on real-time problems in speech detection. In the second tier, acoustic features that are robust to medication changes in Levodopa-responsive patients are investigated for HKD recognition. Keywords such as "Hypokinetic Dysarthria" and "Speech recognition in real time" were used in the search engines. IEEE Xplore produced the most useful search hits as compared to Google Scholar, ELIN, EBRARY, PubMed and LIBRIS. Results: Vowel and consonant formants are the most relevant acoustic parameters to reflect PD medication changes. Since relevant speech segments (consonants and vowels) contain a minority of the speech energy, intelligibility can be improved by amplifying the voice signal using amplitude compression. Pause detection and peak-to-average power rate calculations for voice segmentation produce rich voice features in real time. Voice segmentation can be enhanced by introducing the zero-crossing rate (ZCR). Consonants have high ZCR whereas vowels have low ZCR. The wavelet transform is found promising for voice analysis since it quantizes non-stationary voice signals over time series using scale and translation parameters. In this way voice intelligibility in the waveforms can be analyzed in each time frame. Conclusions: This review evaluated HKD recognition algorithms to develop a tool for PD speech home assessment using modern mobile technology. An algorithm that tackles real-time constraints in HKD recognition based on the review synthesis is proposed. We suggest that speech features may be further processed using wavelet transforms and used with a neural network for detection and quantification of speech anomalies related to PD. Based on this model, patients' speech can be automatically categorized according to UPDRS speech ratings.
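The zero-crossing rate mentioned in this abstract as a cue for separating consonants from vowels can be sketched in a few lines. This is a generic textbook formulation, not the algorithm proposed in the review; the function name `zero_crossing_rate` and the choice to count a sample of exactly zero as non-negative are assumptions.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ.

    A value near 0 suggests a slowly varying, low-frequency frame
    (vowel-like); a value near 1 suggests a rapidly alternating,
    noise-like frame (as in many consonants). Samples equal to zero
    are treated as non-negative, which is an arbitrary convention.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

# Illustrative frames (made-up sample values).
slow_frame = [1.0, 2.0, 3.0, 2.0, 1.0]      # no sign changes
fast_frame = [1.0, -1.0, 1.0, -1.0, 1.0]    # sign changes at every step
```

In a frame-based segmenter, this value would typically be computed per short frame and compared against a threshold, with the threshold itself tuned on recorded data.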
Abstract:
BACKGROUND: control of the size of the velopharyngeal opening is an important variable in characterizing the acoustic profile of hypernasal speech. AIM: to investigate the spectral aspects of the frequencies of F1, F2, F3, the nasal formant (FN) and the anti-formant, in Hertz, for the vowels [a] and [ã] in the presence of openings made in the bulb of replicas of the palatal prosthesis of a patient with velopharyngeal insufficiency. METHOD: recordings of productions of four words (pato/mato and panto/manto) embedded in a carrier phrase were obtained under five conditions of velopharyngeal functioning: prosthesis without openings (control condition: CC), prosthesis with a 10 mm² opening in the bulb (experimental condition CE10), with a 20 mm² opening (experimental condition CE20), with a 30 mm² opening (experimental condition CE30), and without the prosthesis (open experimental condition: CEA). Five speech-language pathologists judged speech nasality live, during the reading of an oral text. The recordings were used for spectral analysis. RESULTS: F1 values were significantly higher for [a] than for [ã] in all conditions. F2 values for [a] in CE20 and CE30 were significantly lower than in the other conditions, approaching the values for [ã]. F3 values did not differ significantly across conditions. There was a relationship between the FN and anti-formant findings and the perception of nasality for the CE10 and CE20 conditions. CONCLUSION: significant changes in the spectral values studied were observed according to changes in the size of the velopharyngeal opening.
Abstract:
This paper deals with the phonological definition of trills in Brazilian Portuguese. The phonemic existence of two distinctive R's, one soft and the other strong, is taken for granted. After a review of the ideas of some Portuguese-speaking phoneticians on this matter, 146 occurrences of R's, recorded by two informants, were acoustically analyzed; the formants' general aspect and the length of the R segments were studied in the resulting spectrograms. The phonological table displayed in the conclusion does not include any trill. The soft phoneme /r/, instead, is classified as an interrupted apico-alveolar tap with a retroflex allophone. Neutralization between these two units is discarded.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
In the area of Phonetics, current studies are mainly geared toward acoustic analysis of speech. Personal computers and the available software have made these analyses easier to carry out. The following work used the software called PRAAT. Besides showing how it helps the researcher, the aim of this work is to bring new data for future consultation in this field, allowing comparisons and discussions about this subject. At FONAC, which hosts a Project of Departmental Traineeship coordinated by Prof. Dr. Luiz Carlos Cagliari and aimed at training undergraduates and graduates in Acoustic Phonetics, there is a good-quality recording in which a speaker of the paulista dialect twice reads an excerpt of Michael Ende and Annegert Fuchshuber’s book ‘The Dream-eater’. The data obtained from it were analyzed directly or by statistical procedures. Tables and charts created from these data helped to visualize the similarities and differences between the vowels, allowing an easier comprehension of the articulatory phenomenon. With the formants, specifically, prototypical values for the vowels of Portuguese and of the dialect under study were obtained.