913 resultados para Speech genre


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Compression amplification significantly alters the acoustic speech signal in comparison to linear amplification. The central hypothesis of the present study was that the compression settings of a two-channel aid that best preserved the acoustic properties of speech compared to linear amplification would yield the best perceptual results, and that the compression settings that most altered the acoustic properties of speech compared to linear would yield significantly poorer speech perception. On the basis of initial acoustic analysis of the test stimuli recorded through a hearing aid, two different compression amplification settings were chosen for the perceptual study. Participants were 74 adults with mild to moderate sensorineural hearing impairment. Overall, the speech perception results supported the hypothesis. A further aim of the study was to determine if variation in participants' speech perception with compression amplification (compared to linear amplification) could be explained by the individual characteristics of age, degree of loss, dynamic range, temporal resolution, and frequency selectivity; however, no significant relationships were found.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Knowledge of differences in voice and speech characteristics between novice and professional broadcasters is essential for effective education of broadcast journalism students. Because newsreaders rely on optimal voice production, information pertaining to vocal hygiene is also important. The first aim of this study was to compare the voice and speech characteristics of professional newsreaders, student newsreaders and control participants. The second aim was to compare the awareness and use of vocal hygiene across these groups. Professional radio newsreaders, broadcast journalism students and two matched control groups were included in the study. Each participant recorded a news bulletin and completed a questionnaire on vocal hygiene. Data analysis of the recording included objective analysis and perceptual ratings by a panel of three judges. Significant student-professional differences were found. Compared to both the students and the control groups, the professional newsreaders had greater variation in speaking fundamental frequency, a faster rate of speech, fewer pronunciation errors and higher perceptual ratings on vocal quality, emphasis, continuity, phrasing and style of newsreading. Female professional newsreaders had a higher speaking fundamental frequency than both their control participants and the student newsreaders. Comparison of vocal hygiene awareness revealed few significant differences between any of the groups.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper is situated within a theoretical discussion that asserts a positive correlation between indirectness and politeness, and uses these claims as a springboard for an examination of polite ways of issuing directives in eighteenth century Chinese. On the basis of an analysis of directive speech acts from dialogue in the novel Honglou meng, it argues that the concept of indirectness, as it applies to the illocutionary transparency of individual speech acts, has no particular value in the culture and language of eighteenth century Chinese men. It is found that other linguistic devices such as particles, the reduplication of verbs, terms of address, and the presence and sequencing of supportive moves are far more significant to the communication of politeness. The findings suggest a need to rethink current theoretical positions on this subject and move beyond the analysis of individual speech acts to examine the role discourse structure and management play in the enactment of politeness. (C) 2002 Elsevier Science B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In music genre classification, most approaches rely on statistical characteristics of low-level features computed on short audio frames. In these methods, it is implicitly considered that frames carry equally relevant information loads and that either individual frames, or distributions thereof, somehow capture the specificities of each genre. In this paper we study the representation space defined by short-term audio features with respect to class boundaries, and compare different processing techniques to partition this space. These partitions are evaluated in terms of accuracy on two genre classification tasks, with several types of classifiers. Experiments show that a randomized and unsupervised partition of the space, used in conjunction with a Markov Model classifier lead to accuracies comparable to the state of the art. We also show that unsupervised partitions of the space tend to create less hubs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The tongue is the most important and dynamic articulator for speech formation, because of its anatomic aspects (particularly, the large volume of this muscular organ comparatively to the surrounding organs of the vocal tract) and also due to the wide range of movements and flexibility that are involved. In speech communication research, a variety of techniques have been used for measuring the three-dimensional vocal tract shapes. More recently, magnetic resonance imaging (MRI) becomes common; mainly, because this technique allows the collection of a set of static and dynamic images that can represent the entire vocal tract along any orientation. Over the years, different anatomical organs of the vocal tract have been modelled; namely, 2D and 3D tongue models, using parametric or statistical modelling procedures. Our aims are to present and describe some 3D reconstructed models from MRI data, for one subject uttering sustained articulations of some typical Portuguese sounds. Thus, we present a 3D database of the tongue obtained by stack combinations with the subject articulating Portuguese vowels. This 3D knowledge of the speech organs could be very important; especially, for clinical purposes (for example, for the assessment of articulatory impairments followed by tongue surgery in speech rehabilitation), and also for a better understanding of acoustic theory in speech formation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The first and second authors would like to thank the support of the PhD grants with references SFRH/BD/28817/2006 and SFRH/PROTEC/49517/2009, respectively, from Fundação para a Ciência e Tecnol ogia (FCT). This work was partially done in the scope of the project “Methodologies to Analyze Organs from Complex Medical Images – Applications to Fema le Pelvic Cavity”, wi th reference PTDC/EEA- CRO/103320/2008, financially supported by FCT.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The mechanisms of speech production are complex and have been raising attention from researchers of both medical and computer vision fields. In the speech production mechanism, the articulator’s study is a complex issue, since they have a high level of freedom along this process, namely the tongue, which instigates a problem in its control and observation. In this work it is automatically characterized the tongues shape during the articulation of the oral vowels of Portuguese European by using statistical modeling on MR-images. A point distribution model is built from a set of images collected during artificially sustained articulations of Portuguese European sounds, which can extract the main characteristics of the motion of the tongue. The model built in this work allows under standing more clearly the dynamic speech events involved during sustained articulations. The tongue shape model built can also be useful for speech rehabilitation purposes, specifically to recognize the compensatory movements of the articulators during speech production.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cet article présente les premiers résultats d’une recherche s’intéressant à l’articulation entre imitation et invention dans l’écriture chez les jeunes scripteurs. L’étude s’attache à observer les modes de récupération de ressources textuelles fournies par les conditions de production. L’expérimentation a été conçue pour observer l’appropriation d’un genre. Des dispositifs didactiques ont été proposés à plusieurs classes de fin d’école primaire française, à partir de textes littéraires issus de la robinsonnade mis à disposition soit simultanément à l’acte d’écriture soit lors d’une seconde écriture. L’étude montre comment les élèves ont recours à deux procédés contrastés : le réinvestissement du lexique et la reformulation. Les données recueillies mettent en évidence la reprise attendue de mots caractéristiques du genre, et révèlent l’ingéniosité des scripteurs pour restructurer des matériaux langagiers. Certaines stratégies témoignent des difficultés rencontrées par les élèves qui ont eu à interpréter le lexique littéraire puis à le transférer dans leur propre récit. Des modes de reformulation différents coexistent dont on peut offrir une première catégorisation en fonction d’une appropriation plus ou moins réussie du genre considéré.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: In Portugal, the routine clinical practice of speech and language therapists (SLTs) in treating children with all types of speech sound disorder (SSD) continues to be articulation therapy (AT). There is limited use of phonological therapy (PT) or phonological awareness training in Portugal. Additionally, at an international level there is a focus on collecting information on and differentiating between the effectiveness of PT and AT for children with different types of phonologically based SSD, as well as on the role of phonological awareness in remediating SSD. It is important to collect more evidence for the most effective and efficient type of intervention approach for different SSDs and for these data to be collected from diverse linguistic and cultural perspectives. Aims: To evaluate the effectiveness of a PT and AT approach for treatment of 14 Portuguese children, aged 4.0–6.7 years, with a phonologically based SSD. Methods & Procedures: The children were randomly assigned to one of the two treatment approaches (seven children in each group). All children were treated by the same SLT, blind to the aims of the study, over three blocks of a total of 25 weekly sessions of intervention. Outcome measures of phonological ability (percentage of consonants correct (PCC), percentage occurrence of different phonological processes and phonetic inventory) were taken before and after intervention. A qualitative assessment of intervention effectiveness from the perspective of the parents of participants was included. Outcomes & Results: Both treatments were effective in improving the participants’ speech, with the children receiving PT showing a more significant improvement in PCC score than those receiving the AT. Children in the PT group also showed greater generalization to untreated words than those receiving AT. Parents reported both intervention approaches to be as effective in improving their children’s speech. Conclusions & Implications: The PT (combination of expressive phonological tasks, phonological awareness, listening and discrimination activities) proved to be an effective integrated method of improving phonological SSD in children. These findings provide some evidence for Portuguese SLTs to employ PT with children with phonologically based SSD

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The relation of automatic auditory discrimination, measured with MMN, with the type of stimuli has not been well established in the literature, despite its importance as an electrophysiological measure of central sound representation. In this study, MMN response was elicited by pure-tone and speech binaurally passive auditory oddball paradigm in a group of 8 normal young adult subjects at the same intensity level (75 dB SPL). The frequency difference in pure-tone oddball was 100 Hz (standard = 1 000 Hz; deviant = 1 100 Hz; same duration = 100 ms), in speech oddball (standard /ba/; deviant /pa/; same duration = 175 ms) the Portuguese phonemes are both plosive bi-labial in order to maintain a narrow frequency band. Differences were found across electrode location between speech and pure-tone stimuli. Larger MMN amplitude, duration and higher latency to speech were verified compared to pure-tone in Cz and Fz as well as significance differences in latency and amplitude between mastoids. Results suggest that speech may be processed differently than non-speech; also it may occur in a later stage due to overlapping processes since more neural resources are required to speech processing.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (meanmedian and arithmetic mean geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (knearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve on the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

European Master in Multimedia and Audiovisual Administration

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Trabalho apresentado no âmbito do Mestrado em Engenharia Informática, como requisito parcial para obtenção do grau de Mestre em Engenharia Informática

Relevância:

20.00% 20.00%

Publicador:

Resumo:

As the wireless cellular market reaches competitive levels never seen before, network operators need to focus on maintaining Quality of Service (QoS) a main priority if they wish to attract new subscribers while keeping existing customers satisfied. Speech Quality as perceived by the end user is one major example of a characteristic in constant need of maintenance and improvement. It is in this topic that this Master Thesis project fits in. Making use of an intrusive method of speech quality evaluation, as a means to further study and characterize the performance of speech codecs in second-generation (2G) and third-generation (3G) technologies. Trying to find further correlation between codecs with similar bit rates, along with the exploration of certain transmission parameters which may aid in the assessment of speech quality. Due to some limitations concerning the audio analyzer equipment that was to be employed, a different system for recording the test samples was sought out. Although the new designed system is not standard, after extensive testing and optimization of the system's parameters, final results were found reliable and satisfactory. Tests include a set of high and low bit rate codecs for both 2G and 3G, where values were compared and analysed, leading to the outcome that 3G speech codecs perform better, under the approximately same conditions, when compared with 2G. Reinforcing the idea that 3G is, with no doubt, the best choice if the costumer looks for the best possible listening speech quality. Regarding the transmission parameters chosen for the experiment, the Receiver Quality (RxQual) and Received Energy per Chip to the Power Density Ratio (Ec/N0), these were subject to speech quality correlation tests. Final results of RxQual were compared to those of prior studies from different researchers and, are considered to be of important relevance. Leading to the confirmation of RxQual as a reliable indicator of speech quality. As for Ec/N0, it is not possible to state it as a speech quality indicator however, it shows clear thresholds for which the MOS values decrease significantly. The studied transmission parameters show that they can be used not only for network management purposes but, at the same time, give an expected idea to the communications engineer (or technician) of the end-to-end speech quality consequences. With the conclusion of the work new ideas for future studies come to mind. Considering that the fourth-generation (4G) cellular technologies are now beginning to take an important place in the global market, as the first all-IP network structure, it seems of great relevance that 4G speech quality should be subject of evaluation. Comparing it to 3G, not only in narrowband but also adding wideband scenarios with the most recent standard objective method of speech quality assessment, POLQA. Also, new data found on Ec/N0 tests, justifies further research studies with the intention of validating the assumptions made in this work.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work an adaptive modeling and spectral estimation scheme based on a dual Discrete Kalman Filtering (DKF) is proposed for speech enhancement. Both speech and noise signals are modeled by an autoregressive structure which provides an underlying time frame dependency and improves time-frequency resolution. The model parameters are arranged to obtain a combined state-space model and are also used to calculate instantaneous power spectral density estimates. The speech enhancement is performed by a dual discrete Kalman filter that simultaneously gives estimates for the models and the signals. This approach is particularly useful as a pre-processing module for parametric based speech recognition systems that rely on spectral time dependent models. The system performance has been evaluated by a set of human listeners and by spectral distances. In both cases the use of this pre-processing module has led to improved results.