997 resultados para stage speech
Resumo:
Bowen and colleagues’ methods and conclusions raise concerns.1 At best, the trial evaluates the variability in current practice. In no way is it a robust test of treatment. Two communication impairments (aphasia and dysarthria) were included. In the post-acute stage spontaneous recovery is highly unpredictable, and changes in the profile of impairment during this time are common.2 Both impairments manifest in different forms,3 which may be more or less responsive to treatment. A third kind of impairment, apraxia of speech, was not excluded but was not targeted in therapy. All three impairments can and do co-occur. Whether randomised controlled trial designs can effectively cope with such complex disorders has been discussed elsewhere.4 Treatment was defined within terms of current practice but was unconstrained. Therefore, the treatment group would have received a variety of therapeutic approaches and protocols, some of which may indeed be ineffective. Only 53% of the contact time with a speech and language therapist was direct (one to one), the rest was impairment based therapy. In contrast, all of the visitors’ time was direct contact, usually in conversation. In both groups, the frequency and length of contact time varied. We already know that the transfer from impairment based therapy to functional communication can be limited and varies across individuals.5 However, it is not possible to conclude from this trial that one to one impairment based therapy should be replaced. For that, a well defined impairment therapy protocol must be directly compared with a similarly well defined functional communication therapy, with an attention control.
Resumo:
We present a new approach for corpus-based speech enhancement that significantly improves over a method published by Xiao and Nickel in 2010. Corpus-based enhancement systems do not merely filter an incoming noisy signal, but resynthesize its speech content via an inventory of pre-recorded clean signals. The goal of the procedure is to perceptually improve the sound of speech signals in background noise. The proposed new method modifies Xiao's method in four significant ways. Firstly, it employs a Gaussian mixture model (GMM) instead of a vector quantizer in the phoneme recognition front-end. Secondly, the state decoding of the recognition stage is supported with an uncertainty modeling technique. With the GMM and the uncertainty modeling it is possible to eliminate the need for noise dependent system training. Thirdly, the post-processing of the original method via sinusoidal modeling is replaced with a powerful cepstral smoothing operation. And lastly, due to the improvements of these modifications, it is possible to extend the operational bandwidth of the procedure from 4 kHz to 8 kHz. The performance of the proposed method was evaluated across different noise types and different signal-to-noise ratios. The new method was able to significantly outperform traditional methods, including the one by Xiao and Nickel, in terms of PESQ scores and other objective quality measures. Results of subjective CMOS tests over a smaller set of test samples support our claims.
Resumo:
From the moment of their birth, a person's life is determined by their sex. Ms. Goroshko wants to know why this difference is so striking, why society is so concerned to sustain it, and how it is able to persist even when certain national or behavioural stereotypes are erased between people. She is convinced of the existence of not only social, but biological differences between men and women, and set herself the task, in a manuscript totalling 126 pages, written in Ukrainian and including extensive illustrations, of analysing these distinctions as they are manifested in language. She points out that, even before 1900, certain stylistic differences between the ways that men and women speak had been noted. Since then it has become possible, for instance in the case of Japanese, to point to examples of male and female sub-languages. In general, one can single out the following characteristics. Males tend to write with less fluency, to refer to events in a verb-phrase, to be time-oriented, to involve themselves more in their references to events, to locate events in their personal sphere of activity, and to refer less to others. Therefore, concludes Ms Goroshko, the male is shown to be more active, more ego-involved in what he does, and less concerned about others. Women, in contrast, were more fluent, referred to events in a noun-phrase, were less time-oriented, tended to be less involved in their event-references, locate events within their interactive community and refer more to others. They spent much more time discussing personal and domestic subjects, relationship problems, family, health and reproductive matters, weight, food and clothing, men, and other women. As regards discourse strategies, Ms Goroshko notes the following. Men more often begin a conversation, they make more utterances, these utterances are longer, they make more assertions, speak less carefully, generally determine the topic of conversation, speak more impersonally, use more vulgar expressions, and use fewer diminutives and more imperatives. Women's speech strategies, apart from being the opposite of those enumerated above, also contain more euphemisms, polite forms, apologies, laughter and crying. All of the above leads Ms. Goroshko to conclude that the differences between male and female speech forms are more striking than the similarities. Furthermore she is convinced that the biological divergence between the sexes is what generates the verbal divergence, and that social factors can only intensify or diminish the differentiation in verbal behaviour established by the sex of a person. Bearing all this in mind, Ms Goroshko set out to construct a grammar of male and female styles of speaking within Russian. One of her most important research tools was a certain type of free association test. She took a list comprising twelve stimuli (to love, to have, to speak, to fuck, a man, a woman, a child, the sky, a prayer, green, beautiful) and gave it to a group of participants specially selected, according to a preliminary psychological testing, for the high levels of masculinity or femininity they displayed. Preliminary responses revealed that the female reactions were more diverse than the male ones, there were more sentences and word combinations in the female reactions, men gave more negative responses to the stimulus and sometimes didn't want to react at all, women reacted more to adjectives and men to nouns, and that, surprisingly, women coloured more negatively their reactions to the words man, to love and a child (Ms. Goroshko is inclined to attribute this to the present economic situation in Russia). Another test performed by Ms. Goroshko was the so-called "defective text" developed by A.A. Brudny. All participants were distributed with packets of complete sentences, which had been taken from a text and then mixed at random. The task was to reconstruct the original text. There were three types of test, the first descriptive, the second narrative, and the third logical. Ms. Goroshko created computer programmes to analyse the results. She found that none of the reconstructed texts was coincident with the original, differing both from the original text and amongst themselves and that there were many more disparities in the male than the female texts. In the descriptive and logical texts the differences manifested themselves more clearly in the male texts, and in the narrative texts in the female texts. The widest dispersal of values was observed at the outset, while the female text ending was practically coincident with the original (in contrast to the male ending). The greatest differences in text reconstruction for both males and females were registered in the middle of the texts. Women, Ms. Goroshko claims, were more sensitive to the semantic structure of the texts, since they assembled the narrative text much more accurately than the other two, while the men assembled more accurately the logical text. Texts written by women were assembled more accurately by women and texts by men by men. On the basis of computer analysis, Ms. Goroshko found that female speech was substantially more emotional. It was expressed by various means, hyperbole, metaphor, comparisons, epithets, ways of enumeration, and with the aid of interjections, rhetorical questions, exclamations. The level of literacy was higher for female speech, and there were fewer mistakes in grammar and spelling in female texts. The last stage of Ms Goroshko's research concerned the social stereotypes of beliefs about men and women in Russian society today. A large number of respondents were asked questions such as "What merits must a woman possess?", "What are male vices and virtues?", etc. After statistical manipulation, an image of modern man and woman, as it exists in the minds of modern Russian men and women, emerged. Ms. Goroshko believes that her findings are significant not only within the field of linguistics. She has already successfully worked on anonymous texts and been able to decide on the sex of the author and consequently believes that in the future her research may even be of benefit to forensic science.
Resumo:
The aim of this study was to compare the speech in subjects with cleft lip and palate, in whom three methods of the hard palate closure were used. One hundred and thirty-seven children (96 boys, 41 girls; mean age = 12 years, SD = 1·2) with complete unilateral cleft lip and palate (CUCLP) operated by a single surgeon with a one-stage method were evaluated. The management of the cleft lip and soft palate was comparable in all subjects; for hard palate repair, three different methods were used: bilateral von Langenbeck closure (b-vL group, n = 39), unilateral von Langenbeck closure (u-vL group, n = 56) and vomerplasty (v-p group, n = 42). Speech was assessed: (i) perceptually for the presence of a) hypernasality, b) compensatory articulations (CAs), c) audible nasal air emissions (ANE) and d) speech intelligibility; (ii) for the presence of compensatory facial grimacing, (iii) with clinical intra-oral evaluation and (iv) with videonasendoscopy. A total rate of hypernasality requiring pharyngoplasty was 5·1%; total incidence post-oral compensatory articulations (CAs) was 2·2%. The overall speech intelligibility was good in 84·7% of cases. Oronasal fistulas (ONFs) occurred in 15·7% b-vL subjects, 7·1% u-vL subjects and 50% v-p subjects (P < 0·001). No statistically significant intergroup differences for hypernasality, CAs and intelligibility were found (P > 0·1). In conclusion, the speech after early one-stage repair of CUCLP was satisfactory. The method of hard palate repair affected the incidence of ONFs, which, however, caused relatively mild and inconsistent speech errors.
Resumo:
La última década ha sido testigo de importantes avances en el campo de la tecnología de reconocimiento de voz. Los sistemas comerciales existentes actualmente poseen la capacidad de reconocer habla continua de múltiples locutores, consiguiendo valores aceptables de error, y sin la necesidad de realizar procedimientos explícitos de adaptación. A pesar del buen momento que vive esta tecnología, el reconocimiento de voz dista de ser un problema resuelto. La mayoría de estos sistemas de reconocimiento se ajustan a dominios particulares y su eficacia depende de manera significativa, entre otros muchos aspectos, de la similitud que exista entre el modelo de lenguaje utilizado y la tarea específica para la cual se está empleando. Esta dependencia cobra aún más importancia en aquellos escenarios en los cuales las propiedades estadísticas del lenguaje varían a lo largo del tiempo, como por ejemplo, en dominios de aplicación que involucren habla espontánea y múltiples temáticas. En los últimos años se ha evidenciado un constante esfuerzo por mejorar los sistemas de reconocimiento para tales dominios. Esto se ha hecho, entre otros muchos enfoques, a través de técnicas automáticas de adaptación. Estas técnicas son aplicadas a sistemas ya existentes, dado que exportar el sistema a una nueva tarea o dominio puede requerir tiempo a la vez que resultar costoso. Las técnicas de adaptación requieren fuentes adicionales de información, y en este sentido, el lenguaje hablado puede aportar algunas de ellas. El habla no sólo transmite un mensaje, también transmite información acerca del contexto en el cual se desarrolla la comunicación hablada (e.g. acerca del tema sobre el cual se está hablando). Por tanto, cuando nos comunicamos a través del habla, es posible identificar los elementos del lenguaje que caracterizan el contexto, y al mismo tiempo, rastrear los cambios que ocurren en estos elementos a lo largo del tiempo. Esta información podría ser capturada y aprovechada por medio de técnicas de recuperación de información (information retrieval) y de aprendizaje de máquina (machine learning). Esto podría permitirnos, dentro del desarrollo de mejores sistemas automáticos de reconocimiento de voz, mejorar la adaptación de modelos del lenguaje a las condiciones del contexto, y por tanto, robustecer al sistema de reconocimiento en dominios con condiciones variables (tales como variaciones potenciales en el vocabulario, el estilo y la temática). En este sentido, la principal contribución de esta Tesis es la propuesta y evaluación de un marco de contextualización motivado por el análisis temático y basado en la adaptación dinámica y no supervisada de modelos de lenguaje para el robustecimiento de un sistema automático de reconocimiento de voz. Esta adaptación toma como base distintos enfoque de los sistemas mencionados (de recuperación de información y aprendizaje de máquina) mediante los cuales buscamos identificar las temáticas sobre las cuales se está hablando en una grabación de audio. Dicha identificación, por lo tanto, permite realizar una adaptación del modelo de lenguaje de acuerdo a las condiciones del contexto. El marco de contextualización propuesto se puede dividir en dos sistemas principales: un sistema de identificación de temática y un sistema de adaptación dinámica de modelos de lenguaje. Esta Tesis puede describirse en detalle desde la perspectiva de las contribuciones particulares realizadas en cada uno de los campos que componen el marco propuesto: _ En lo referente al sistema de identificación de temática, nos hemos enfocado en aportar mejoras a las técnicas de pre-procesamiento de documentos, asimismo en contribuir a la definición de criterios más robustos para la selección de index-terms. – La eficiencia de los sistemas basados tanto en técnicas de recuperación de información como en técnicas de aprendizaje de máquina, y específicamente de aquellos sistemas que particularizan en la tarea de identificación de temática, depende, en gran medida, de los mecanismos de preprocesamiento que se aplican a los documentos. Entre las múltiples operaciones que hacen parte de un esquema de preprocesamiento, la selección adecuada de los términos de indexado (index-terms) es crucial para establecer relaciones semánticas y conceptuales entre los términos y los documentos. Este proceso también puede verse afectado, o bien por una mala elección de stopwords, o bien por la falta de precisión en la definición de reglas de lematización. En este sentido, en este trabajo comparamos y evaluamos diferentes criterios para el preprocesamiento de los documentos, así como también distintas estrategias para la selección de los index-terms. Esto nos permite no sólo reducir el tamaño de la estructura de indexación, sino también mejorar el proceso de identificación de temática. – Uno de los aspectos más importantes en cuanto al rendimiento de los sistemas de identificación de temática es la asignación de diferentes pesos a los términos de acuerdo a su contribución al contenido del documento. En este trabajo evaluamos y proponemos enfoques alternativos a los esquemas tradicionales de ponderado de términos (tales como tf-idf ) que nos permitan mejorar la especificidad de los términos, así como también discriminar mejor las temáticas de los documentos. _ Respecto a la adaptación dinámica de modelos de lenguaje, hemos dividimos el proceso de contextualización en varios pasos. – Para la generación de modelos de lenguaje basados en temática, proponemos dos tipos de enfoques: un enfoque supervisado y un enfoque no supervisado. En el primero de ellos nos basamos en las etiquetas de temática que originalmente acompañan a los documentos del corpus que empleamos. A partir de estas, agrupamos los documentos que forman parte de la misma temática y generamos modelos de lenguaje a partir de dichos grupos. Sin embargo, uno de los objetivos que se persigue en esta Tesis es evaluar si el uso de estas etiquetas para la generación de modelos es óptimo en términos del rendimiento del reconocedor. Por esta razón, nosotros proponemos un segundo enfoque, un enfoque no supervisado, en el cual el objetivo es agrupar, automáticamente, los documentos en clusters temáticos, basándonos en la similaridad semántica existente entre los documentos. Por medio de enfoques de agrupamiento conseguimos mejorar la cohesión conceptual y semántica en cada uno de los clusters, lo que a su vez nos permitió refinar los modelos de lenguaje basados en temática y mejorar el rendimiento del sistema de reconocimiento. – Desarrollamos diversas estrategias para generar un modelo de lenguaje dependiente del contexto. Nuestro objetivo es que este modelo refleje el contexto semántico del habla, i.e. las temáticas más relevantes que se están discutiendo. Este modelo es generado por medio de la interpolación lineal entre aquellos modelos de lenguaje basados en temática que estén relacionados con las temáticas más relevantes. La estimación de los pesos de interpolación está basada principalmente en el resultado del proceso de identificación de temática. – Finalmente, proponemos una metodología para la adaptación dinámica de un modelo de lenguaje general. El proceso de adaptación tiene en cuenta no sólo al modelo dependiente del contexto sino también a la información entregada por el proceso de identificación de temática. El esquema usado para la adaptación es una interpolación lineal entre el modelo general y el modelo dependiente de contexto. Estudiamos también diferentes enfoques para determinar los pesos de interpolación entre ambos modelos. Una vez definida la base teórica de nuestro marco de contextualización, proponemos su aplicación dentro de un sistema automático de reconocimiento de voz. Para esto, nos enfocamos en dos aspectos: la contextualización de los modelos de lenguaje empleados por el sistema y la incorporación de información semántica en el proceso de adaptación basado en temática. En esta Tesis proponemos un marco experimental basado en una arquitectura de reconocimiento en ‘dos etapas’. En la primera etapa, empleamos sistemas basados en técnicas de recuperación de información y aprendizaje de máquina para identificar las temáticas sobre las cuales se habla en una transcripción de un segmento de audio. Esta transcripción es generada por el sistema de reconocimiento empleando un modelo de lenguaje general. De acuerdo con la relevancia de las temáticas que han sido identificadas, se lleva a cabo la adaptación dinámica del modelo de lenguaje. En la segunda etapa de la arquitectura de reconocimiento, usamos este modelo adaptado para realizar de nuevo el reconocimiento del segmento de audio. Para determinar los beneficios del marco de trabajo propuesto, llevamos a cabo la evaluación de cada uno de los sistemas principales previamente mencionados. Esta evaluación es realizada sobre discursos en el dominio de la política usando la base de datos EPPS (European Parliamentary Plenary Sessions - Sesiones Plenarias del Parlamento Europeo) del proyecto europeo TC-STAR. Analizamos distintas métricas acerca del rendimiento de los sistemas y evaluamos las mejoras propuestas con respecto a los sistemas de referencia. ABSTRACT The last decade has witnessed major advances in speech recognition technology. Today’s commercial systems are able to recognize continuous speech from numerous speakers, with acceptable levels of error and without the need for an explicit adaptation procedure. Despite this progress, speech recognition is far from being a solved problem. Most of these systems are adjusted to a particular domain and their efficacy depends significantly, among many other aspects, on the similarity between the language model used and the task that is being addressed. This dependence is even more important in scenarios where the statistical properties of the language fluctuates throughout the time, for example, in application domains involving spontaneous and multitopic speech. Over the last years there has been an increasing effort in enhancing the speech recognition systems for such domains. This has been done, among other approaches, by means of techniques of automatic adaptation. These techniques are applied to the existing systems, specially since exporting the system to a new task or domain may be both time-consuming and expensive. Adaptation techniques require additional sources of information, and the spoken language could provide some of them. It must be considered that speech not only conveys a message, it also provides information on the context in which the spoken communication takes place (e.g. on the subject on which it is being talked about). Therefore, when we communicate through speech, it could be feasible to identify the elements of the language that characterize the context, and at the same time, to track the changes that occur in those elements over time. This information can be extracted and exploited through techniques of information retrieval and machine learning. This allows us, within the development of more robust speech recognition systems, to enhance the adaptation of language models to the conditions of the context, thus strengthening the recognition system for domains under changing conditions (such as potential variations in vocabulary, style and topic). In this sense, the main contribution of this Thesis is the proposal and evaluation of a framework of topic-motivated contextualization based on the dynamic and non-supervised adaptation of language models for the enhancement of an automatic speech recognition system. This adaptation is based on an combined approach (from the perspective of both information retrieval and machine learning fields) whereby we identify the topics that are being discussed in an audio recording. The topic identification, therefore, enables the system to perform an adaptation of the language model according to the contextual conditions. The proposed framework can be divided in two major systems: a topic identification system and a dynamic language model adaptation system. This Thesis can be outlined from the perspective of the particular contributions made in each of the fields that composes the proposed framework: _ Regarding the topic identification system, we have focused on the enhancement of the document preprocessing techniques in addition to contributing in the definition of more robust criteria for the selection of index-terms. – Within both information retrieval and machine learning based approaches, the efficiency of topic identification systems, depends, to a large extent, on the mechanisms of preprocessing applied to the documents. Among the many operations that encloses the preprocessing procedures, an adequate selection of index-terms is critical to establish conceptual and semantic relationships between terms and documents. This process might also be weakened by a poor choice of stopwords or lack of precision in defining stemming rules. In this regard we compare and evaluate different criteria for preprocessing the documents, as well as for improving the selection of the index-terms. This allows us to not only reduce the size of the indexing structure but also to strengthen the topic identification process. – One of the most crucial aspects, in relation to the performance of topic identification systems, is to assign different weights to different terms depending on their contribution to the content of the document. In this sense we evaluate and propose alternative approaches to traditional weighting schemes (such as tf-idf ) that allow us to improve the specificity of terms, and to better identify the topics that are related to documents. _ Regarding the dynamic language model adaptation, we divide the contextualization process into different steps. – We propose supervised and unsupervised approaches for the generation of topic-based language models. The first of them is intended to generate topic-based language models by grouping the documents, in the training set, according to the original topic labels of the corpus. Nevertheless, a goal of this Thesis is to evaluate whether or not the use of these labels to generate language models is optimal in terms of recognition accuracy. For this reason, we propose a second approach, an unsupervised one, in which the objective is to group the data in the training set into automatic topic clusters based on the semantic similarity between the documents. By means of clustering approaches we expect to obtain a more cohesive association of the documents that are related by similar concepts, thus improving the coverage of the topic-based language models and enhancing the performance of the recognition system. – We develop various strategies in order to create a context-dependent language model. Our aim is that this model reflects the semantic context of the current utterance, i.e. the most relevant topics that are being discussed. This model is generated by means of a linear interpolation between the topic-based language models related to the most relevant topics. The estimation of the interpolation weights is based mainly on the outcome of the topic identification process. – Finally, we propose a methodology for the dynamic adaptation of a background language model. The adaptation process takes into account the context-dependent model as well as the information provided by the topic identification process. The scheme used for the adaptation is a linear interpolation between the background model and the context-dependent one. We also study different approaches to determine the interpolation weights used in this adaptation scheme. Once we defined the basis of our topic-motivated contextualization framework, we propose its application into an automatic speech recognition system. We focus on two aspects: the contextualization of the language models used by the system, and the incorporation of semantic-related information into a topic-based adaptation process. To achieve this, we propose an experimental framework based in ‘a two stages’ recognition architecture. In the first stage of the architecture, Information Retrieval and Machine Learning techniques are used to identify the topics in a transcription of an audio segment. This transcription is generated by the recognition system using a background language model. According to the confidence on the topics that have been identified, the dynamic language model adaptation is carried out. In the second stage of the recognition architecture, an adapted language model is used to re-decode the utterance. To test the benefits of the proposed framework, we carry out the evaluation of each of the major systems aforementioned. The evaluation is conducted on speeches of political domain using the EPPS (European Parliamentary Plenary Sessions) database from the European TC-STAR project. We analyse several performance metrics that allow us to compare the improvements of the proposed systems against the baseline ones.
Resumo:
Purpose: Both phonological (speech) and auditory (non-speech) stimuli have been shown to predict early reading skills. However, previous studies have failed to control for the level of processing required by tasks administered across the two levels of stimuli. For example, phonological tasks typically tap explicit awareness e.g., phoneme deletion, while auditory tasks usually measure implicit awareness e.g., frequency discrimination. Therefore, the stronger predictive power of speech tasks may be due to their higher processing demands, rather than the nature of the stimuli. Method: The present study uses novel tasks that control for level of processing (isolation, repetition and deletion) across speech (phonemes and nonwords) and non-speech (tones) stimuli. 800 beginning readers at the onset of literacy tuition (mean age 4 years and 7 months) were assessed on the above tasks as well as word reading and letter-knowledge in the first part of a three time-point longitudinal study. Results: Time 1 results reveal a significantly higher association between letter-sound knowledge and all of the speech compared to non-speech tasks. Performance was better for phoneme than tone stimuli, and worse for deletion than isolation and repetition across all stimuli. Conclusions: Results are consistent with phonological accounts of reading and suggest that level of processing required by the task is less important than stimuli type in predicting the earliest stage of reading.
Resumo:
The article describes the method of preliminary segmentation of a speech signal with wavelet transformation use, consisting of two stages. At the first stage there is an allocation of sibilants and pauses, at the second – the further segmentation of the rest signal parts.
Resumo:
The morphological and chemical changes occurring during the thermal decomposition of weddelite, CaC2O4·2H2O, have been followed in real time in a heating stage attached to an Environmental Scanning Electron Microscope operating at a pressure of 2 Torr, with a heating rate of 10 °C/min and an equilibration time of approximately 10 min. The dehydration step around 120 °C and the loss of CO around 425 °C do not involve changes in morphology, but changes in the composition were observed. The final reaction of CaCO3 to CaO while evolving CO2 around 600 °C involved the formation of chains of very small oxide particles pseudomorphic to the original oxalate crystals. The change in chemical composition could only be observed after cooling the sample to 350 °C because of the effects of thermal radiation.
Resumo:
Metaphor is a multi-stage programming language extension to an imperative, object-oriented language in the style of C# or Java. This paper discusses some issues we faced when applying multi-stage language design concepts to an imperative base language and run-time environment. The issues range from dealing with pervasive references and open code to garbage collection and implementing cross-stage persistence.