886 results for Text-to-speech
Abstract:
The crônica is a hybrid textual genre, oscillating between the subjectivity of literature and the objectivity of journalism. It arises from everyday events, is tied to speech and interaction, and has a style that simulates naturalness. Faced with a text of this kind, how can we distinguish the marks of orality that constitute the compositional construction of the genre from the marks of orality used as a stylistic resource by the chronicler? To answer this question, we draw on Stylistics and on studies of orality and writing, guided primarily by the ideas of Marcel Cressot, Nilce Sant'Anna Martins, Norma Discini, José Lemos Monteiro, Claudio Cezar Henriques, Sírio Possenti, Dino Preti, Hudinilson Urbano, Luiz Antônio Marcuschi, Ingedore Villaça Koch, and Eni P. Orlandi, among others. In addition, we examine journalistic language from the perspective of Patrick Charaudeau. The corpus for this theoretical work was assembled from crônicas by Joaquim Ferreira dos Santos, most of whose texts deal with the influence of the spoken modality of the language on readers' everyday lives, which demonstrates his concern with language. We approach language from the socio-interactionist perspective, in Mikhail Bakhtin's terms, and maintain that the marks of orality in Joaquim's crônicas function as a strategic resource for producing a meaning effect intended by the author in a given context.
Abstract:
In this paper, a Decimative Spectral estimation method based on Eigenanalysis and SVD (Singular Value Decomposition) is presented and applied to speech signals in order to estimate Formant/Bandwidth values. The underlying model decomposes a signal into complex damped sinusoids. For finer estimation, the algorithm is applied not only to speech samples but also to a small number of the autocorrelation coefficients of a speech frame. Correct estimation of Formant/Bandwidth values depends on the model order and thus on the requested number of poles. Overall, experimental results indicate that the proposed methodology successfully estimates formant trajectories and their respective bandwidths.
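The link between a complex damped sinusoid and a Formant/Bandwidth pair can be illustrated with a short sketch. It does not reproduce the paper's decimative SVD estimator; it only shows the standard textbook conversion between one discrete-time pole and the corresponding frequency and 3 dB bandwidth in Hz. The function names and the sampling rate are assumptions made for illustration.

```python
import cmath
import math


def pole_from_formant(freq_hz, bandwidth_hz, fs_hz):
    """Build the discrete-time pole of a damped sinusoid (hypothetical helper).

    The pole radius encodes the damping (bandwidth) and the pole angle
    encodes the oscillation frequency.
    """
    radius = math.exp(-math.pi * bandwidth_hz / fs_hz)
    angle = 2.0 * math.pi * freq_hz / fs_hz
    return cmath.rect(radius, angle)


def formant_from_pole(pole, fs_hz):
    """Convert one complex pole into (frequency, bandwidth) in Hz."""
    freq_hz = cmath.phase(pole) * fs_hz / (2.0 * math.pi)
    bandwidth_hz = -math.log(abs(pole)) * fs_hz / math.pi
    return freq_hz, bandwidth_hz


if __name__ == "__main__":
    fs = 8000.0  # assumed sampling rate, not taken from the paper
    pole = pole_from_formant(500.0, 80.0, fs)
    print(formant_from_pole(pole, fs))
```

In an estimator of this family, each pole recovered from the signal model would pass through a conversion like `formant_from_pole`, so a model order that is too low or too high changes which poles, and therefore which formants, are reported.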
Abstract:
Handheld communication devices are developing rapidly, gaining users and spreading across application fields, and have a promising future. This study investigated the acceptance of a multimodal text entry method and users' behavioral characteristics when using it. Based on the general information-processing model of a bimodal system and human-factors studies of a multimodal map system, the study focused mainly on the hand-speech bimodal text entry method. For acceptance, the study investigated the subjective perception of speech recognition accuracy using a Wizard of Oz (WOz) experiment and a questionnaire. Results showed a linear relationship between speech recognition accuracy and subjective accuracy. Furthermore, as familiarity increased, the difference between acceptable accuracy and subjective accuracy gradually decreased. In addition, the similarity in meaning between the speech recognition output and the correct sentences was an important reference criterion. The second study investigated three aspects of the bimodal text entry method: input, error recovery, and modal shifts. The first experiment aimed to identify users' behavioral characteristics during error recovery tasks. Results indicated that participants preferred to correct errors by handwriting, regardless of the input modality. The second experiment aimed to identify users' behavioral characteristics when entering various types of text. Results showed that users preferred speech input in both the word and sentence conditions, a preference that was highly consistent across individuals, while no significant difference was found between handwriting and speech input in the character condition. Participants used the direct strategy more than the jumping strategy to deal with mixed text, especially the Chinese-English mixed type.
The third experiment examined the cognitive load of the different modal shifts; results suggested significant differences between shifts. Moreover, relatively little time was needed for the shift from speech input to hand input. Based on the main findings, the following implications were discussed. First, when evaluating a speech recognition system, attention should be paid to the fact that speech recognition accuracy is not equal to subjective accuracy. Second, to make a speech input system more acceptable, a good method is to train users and provide feedback on accuracy during training, which improves familiarity with and sensitivity to the system. Third, both universal and individual behavioral patterns should be taken into consideration to improve the error recovery method. Fourth, to ease the learning and use of speech input, its operations should be simpler. Fifth, a more convenient input method for non-Chinese text entry should be provided. Finally, the shifting time between hand input and speech input provides an important parameter for the design of automatically evoked speech recognition systems.
Abstract:
This paper provides a summary of our studies on robust speech recognition based on a new statistical approach – the probabilistic union model. We consider speech recognition given that part of the acoustic features may be corrupted by noise. The union model is a method for basing the recognition on the clean part of the features, thereby reducing the effect of the noise on recognition. To this end, the union model is similar to the missing feature method. However, the two methods achieve this end through different routes. The missing feature method usually requires the identity of the noisy data for noise removal, while the union model combines the local features based on the union of random events, to reduce the dependence of the model on information about the noise. We previously investigated the applications of the union model to speech recognition involving unknown partial corruption in frequency band, in time duration, and in feature streams. Additionally, a combination of the union model with conventional noise-reduction techniques was studied, as a means of dealing with a mixture of known or trainable noise and unknown unexpected noise. In this paper, a unified review, in the context of dealing with unknown partial feature corruption, is provided into each of these applications, giving the appropriate theory and implementation algorithms, along with an experimental evaluation.
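The idea of combining local features through a union of events, rather than a plain product, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the commonly cited form in which an order-M union score over N streams is proportional to the sum, over every subset of N-M streams, of the product of their likelihoods. The function name and the likelihood values are made up for the example.

```python
from itertools import combinations
from math import prod


def union_score(stream_likelihoods, order):
    """Order-`order` union score (assumed formulation, hypothetical helper).

    Sums the products of every subset that leaves out `order` streams, so
    some subsets exclude the corrupted streams without knowing which they are.
    """
    n = len(stream_likelihoods)
    if not 0 <= order < n:
        raise ValueError("order must satisfy 0 <= order < number of streams")
    keep = n - order
    return sum(prod(subset) for subset in combinations(stream_likelihoods, keep))


if __name__ == "__main__":
    # Illustrative per-stream likelihoods for two candidate classes.
    # The third stream of the true class is assumed to be hit by noise.
    true_class = [0.9, 0.8, 1e-9]
    wrong_class = [0.3, 0.2, 0.5]
    # Order 0 is the ordinary product rule; order 1 tolerates one bad stream.
    print(union_score(true_class, 0) > union_score(wrong_class, 0))
    print(union_score(true_class, 1) > union_score(wrong_class, 1))
```

With order 0 the single corrupted stream drives the true class's score to almost zero, while with order 1 the subsets that exclude that stream dominate the sum. This is the sense in which no explicit identification of the noisy data is required.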
Abstract:
Reviews the books, Lessons From the Northern Ireland Peace Process edited by Timothy J. White (2013) and Human Rights as War by Other Means by Jennifer Curtis (2014). Edited by a U.S.-based academic with an enduring interest in Ireland, the first book draws together an interdisciplinary group of academics from across North America and the U.K. (though notably not Northern Ireland itself) to cover such topics as third party intervention, nationalism, grassroots change, and community development. The second text to be reviewed may be seen as a thorough analysis of this particular point: what is the role played by human rights in Northern Ireland’s peace process?
Abstract:
This essay aims to compare the literary text Wuthering Heights by Emily Brontë with five of its screen adaptations and their Portuguese subtitles. Owing to the scope of the study, it will necessarily afford merely a bird's-eye view of the issues and serve as a starting point for further research. Accordingly, the following questions are used as guidelines: What transformations occur in the process of adapting the original text to the screen? Do subtitles update the film dialogues to the target audience's cultural and linguistic context? Are subtitles influenced more by oral speech than by written literary discourse? Shouldn't subtitles in fact reflect the poetic function prevalent in screen adaptations of literary texts? Rather than attempt to answer these questions, we focus on the objects as phenomena. Our interdisciplinary undertaking clearly involves a semio-pragmatic stance, at this stage trying to avoid theoretical backdrops that may affect our apprehension of the objects as to their qualities, singularities, and conventional traits, based on Lucia Santaella's interpretation of Charles S. Peirce's phaneroscopy. From an empirical standpoint, we gather features and describe peculiarities, under the presumption that there are substrata in subtitling that point or should point to the literary source text, albeit through the mediation of a film script and a particular cinematic style. Therefore, we consider how the subtitling process may be influenced by the literary intertext, the idiosyncrasies of a particular film adaptation, as well as the socio-cultural context of the subtitler and target audience. First, we isolate one of the novel's most poignant scenes, 'I am Heathcliff', taking into account its symbolic play and significance in relation to character and plot construction. Secondly, we study American, English, French, and Mexican adaptations of the excerpt into film in terms of intersemiotic transformations. 
Then we analyze differences between the film dialogues and their Portuguese subtitles.
Abstract:
Search is now going beyond looking for factual information, and people wish to search for the opinions of others to help them in their own decision-making. Sentiment or opinion expressions are used by users to express their opinions and embody important pieces of information, particularly in online commerce. The main problem that the present dissertation addresses is how to model text to find meaningful words that express a sentiment. In this context, I investigate the viability of automatically generating a sentiment lexicon for opinion retrieval and sentiment classification applications. For this research objective we propose to capture sentiment words that are derived from online users' reviews. In this approach, we tackle a major challenge in sentiment analysis, namely the detection of words that express subjective preference and of domain-specific sentiment words such as jargon. To this end we present a fully generative method that automatically learns a domain-specific lexicon and is fully independent of external sources. Sentiment lexicons can be applied in a broad set of applications; however, popular recommendation algorithms have so far been largely disconnected from sentiment analysis. Therefore, we present a study that explores the viability of applying sentiment analysis techniques to infer ratings in a recommendation algorithm. Furthermore, entities' reputation is intrinsically associated with sentiment words that have a positive or negative relation with those entities. Hence, we provide a study that observes the viability of using a domain-specific lexicon to compute entities' reputation. Finally, a recommendation system algorithm is improved with the use of sentiment-based ratings and entities' reputation.
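Learning a domain-specific lexicon from user reviews without external resources can be sketched as below. This is not the dissertation's generative method; it swaps in a much simpler smoothed log-odds score over rated reviews, purely as an illustration. The function name, the rating thresholds, and the toy reviews are all assumptions.

```python
from collections import Counter
from math import log


def learn_lexicon(reviews, smoothing=1.0):
    """Score words by smoothed log-odds of appearing in positive vs negative reviews.

    `reviews` is an iterable of (text, rating) pairs with ratings on a 1-5 scale.
    Ratings of 4-5 count as positive, 1-2 as negative; 3 is ignored (assumption).
    """
    pos, neg = Counter(), Counter()
    for text, rating in reviews:
        if rating >= 4:
            pos.update(text.lower().split())
        elif rating <= 2:
            neg.update(text.lower().split())
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    return {
        word: log((pos[word] + smoothing) / pos_total)
        - log((neg[word] + smoothing) / neg_total)
        for word in vocab
    }


if __name__ == "__main__":
    # Toy reviews invented for illustration only.
    toy_reviews = [
        ("great battery and great screen", 5),
        ("terrible battery", 1),
        ("great value", 4),
        ("terrible screen and poor value", 2),
        ("average phone", 3),
    ]
    lexicon = learn_lexicon(toy_reviews)
    for word, score in sorted(lexicon.items(), key=lambda item: item[1]):
        print(f"{word:10s} {score:+.2f}")
```

Words that occur equally often on both sides, such as product terms like "battery", score near zero, while words tied to one polarity stand out. Scores of this kind could then be aggregated per review to infer a rating, which is the sort of use in recommendation that the abstract describes.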
Abstract:
In the wake of the financial scandals that have shaken the business world in recent years, the effectiveness of corporate governance practices, and in particular those relating to director independence, has come under close scrutiny. The nominee director, appointed by a party to represent it, is a type of director frequently found on corporate boards. However, one may question the real independence of these directors, given their loyalty to the person who appointed them, who usually holds an interest in the company concerned as a shareholder or stakeholder. Moreover, while legal principles require directors to act in the best interests of the company, practical reality is sometimes quite different: faced with the instructions or wishes of the person who appointed them, nominee directors find themselves in an inherent conflict of interest. This text aims to offer a detailed analysis of the nominee director and of the conflict of interest resulting from this dual duty of loyalty. The objective is to present an in-depth examination of the various difficulties resulting from or associated with the appointment of a nominee director, as well as the judicial and legislative responses to this problem. This reflection leads to an exploration of certain legislative and legal systems, in particular those of the United Kingdom, Australia, and New Zealand, in order to gain a better understanding and offer an informed perspective on the issues analyzed here.
Abstract:
For this paper, heterolingualism or language plurality will be considered as the presence in a single text or in a social environment of both French and English, Canada’s official languages. Language plurality will here be studied from an institutional viewpoint: the influence of the Canadian government on the translation of political speeches. The first part of this article will establish that political speeches are written in a bilingual environment where the two official languages are often in contact. This bilingualism, however, is often homogenised when it comes to speech delivery and publication. Therefore, the second part focuses on the speeches’ paratextual
Abstract:
In this paper, we describe an interdisciplinary project in which visualization techniques were developed for and applied to scholarly work from literary studies. The aim was to bring Christof Schöch's electronic edition of Bérardier de Bataut's Essai sur le récit (1776) to the web. This edition is based on the Text Encoding Initiative's XML-based encoding scheme (TEI P5, subset TEI-Lite). This now de facto standard applies to machine-readable texts used chiefly in the humanities and social sciences. The intention of this edition is to make the edited text freely available on the web, to allow for alternative text views (here original and modern/corrected text), to ensure reader-friendly annotation and navigation, and to permit online collaboration in encoding and annotation as well as user comments, all in an open source, generically usable, lightweight package. These aims were attained by relying on a GPL-licensed, freely available CMS (Drupal) and combining it with XSL stylesheets and JavaScript.
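The idea of deriving alternative text views from a single TEI source can be sketched as follows. The project itself relied on XSL stylesheets; this sketch uses Python's standard library instead, purely for illustration. It assumes that variants are encoded with the TEI P5 `choice` element wrapping `orig`/`reg` or `sic`/`corr` pairs, which may not be how the edition actually marks them, and the sample sentence is invented rather than taken from the edition.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented fragment for illustration; not text from the actual edition.
SAMPLE = (
    '<p xmlns="' + TEI_NS + '">Le '
    "<choice><orig>recit</orig><reg>récit</reg></choice> est "
    "<choice><sic>anime</sic><corr>animé</corr></choice>.</p>"
)

VIEWS = {
    "original": {"orig", "sic"},  # readings kept in the original view
    "modern": {"reg", "corr"},    # readings kept in the modern/corrected view
}
ALL_VARIANTS = {"orig", "sic", "reg", "corr"}


def render(element, view):
    """Return the plain text of `element` for the requested view."""
    dropped = ALL_VARIANTS - VIEWS[view]

    def walk(node):
        local_name = node.tag.split("}")[-1]  # strip the namespace prefix
        if local_name in dropped:
            return ""  # skip the variant that belongs to the other view
        parts = [node.text or ""]
        for child in node:
            parts.append(walk(child))
            parts.append(child.tail or "")
        return "".join(parts)

    return walk(element)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(render(root, "original"))
    print(render(root, "modern"))
```

Keeping both readings in one encoded source and choosing between them at display time is what lets a reader switch views without the edition maintaining two separate texts.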
Abstract:
This undergraduate thesis studies how two Colombian newspapers, El Tiempo and El Espectador, presented information on the "media laws" debate in Ecuador and Argentina in recent years. The study analyzed the news items, columns, editorials, feature reports, and interviews published from 2009 (when the debate became widespread) to 2013. The methodology enabled a comparative analysis of the two newspapers and of their positions on the media-law debate. It is argued that the media, as main actors in the debate, have taken sides, in their editorial and news pages, against policies that seek to regulate their activity. Thus, the press persists in imposing discourses with a specific ideology through the actors presented in the construction of its stories, in this case actors disposed to oppose the approval of the media laws proposed by the Argentine and Ecuadorian governments. The interpretive framework of the journalistic texts analyzed reveals the power struggle that persists between the media and governments in Latin American countries.
Abstract:
It has been previously demonstrated that extensive activation in the dorsolateral temporal lobes is associated with masking a speech target with a speech masker, consistent with the hypothesis that competition for central auditory processes is an important factor in informational masking. Here, masking from speech and from two additional maskers derived from the original speech was investigated. One of these is spectrally rotated speech, which is unintelligible and has a similar (inverted) spectrotemporal profile to speech. The authors also controlled for the possibility of “glimpsing” of the target signal during modulated masking sounds by using speech-modulated noise as a masker in a baseline condition. Functional imaging results reveal that masking speech with speech leads to bilateral superior temporal gyrus (STG) activation relative to a speech-in-noise baseline, while masking speech with spectrally rotated speech leads solely to right STG activation relative to the baseline. This result is discussed in terms of hemispheric asymmetries for speech perception, and interpreted as showing that masking effects can arise through two parallel neural systems, in the left and right temporal lobes. This has implications for the competition for resources caused by speech and rotated speech maskers, and may illuminate some of the mechanisms involved in informational masking.
Abstract:
Background: Evidence suggests a reversal of the normal left-lateralised response to speech in schizophrenia. Aims: To test the brain's response to emotional prosody in schizophrenia and bipolar disorder. Method: BOLD contrast functional magnetic resonance imaging of subjects while they passively listened or attended to sentences that differed in emotional prosody. Results: Patients with schizophrenia exhibited normal right-lateralisation of the passive response to 'pure' emotional prosody and relative left-lateralisation of the response to unfiltered emotional prosody. When attending to emotional prosody, patients with schizophrenia activated the left insula more than healthy controls. When listening passively, patients with bipolar disorder demonstrated less activation of the bilateral superior temporal gyri in response to pure emotional prosody, and greater activation of the left superior temporal gyrus in response to unfiltered emotional prosody. In both passive experiments, the patient groups activated different lateral temporal lobe regions. Conclusions: Patients with schizophrenia and bipolar disorder may display some left-lateralisation of the normal right-lateralised temporal lobe response to emotional prosody. Declaration of interest: R.M. received a studentship from Neuraxis, and funding from the Neuroscience and Psychiatry Unit, University of Manchester.