8 results for text-to-grammar
at Instituto Politécnico do Porto, Portugal
Abstract:
This paper describes a rule-based automatic syllabifier for Danish built on the Maximal Onset Principle. The work builds on the earlier success of rule-based methods in Portuguese and Catalan syllabification modules. The system was implemented and tested using a very small set of rules. It reached word accuracy rates of 96.9% and 98.7%, contrary to our initial expectations, since Danish has a complex syllabic structure and is therefore hard to handle with rules. A comparison with a data-driven syllabification system based on artificial neural networks showed a higher accuracy rate for the rule-based system.
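The Maximal Onset Principle mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' rule set. The vowel set, the onset inventory `LEGAL_ONSETS`, and the function name `syllabify` are hypothetical placeholders, and the code works on plain letters rather than on Danish phonology. Each run of vowels is treated as one nucleus, and every consonant cluster between two nuclei gives the following syllable the longest onset that the inventory allows.

```python
# Toy illustration of the Maximal Onset Principle.
# VOWELS and LEGAL_ONSETS are hypothetical placeholders, not Danish rules.
VOWELS = set("aeiouy")
LEGAL_ONSETS = {
    "", "b", "d", "f", "g", "k", "l", "m", "n", "p", "r", "s", "t", "v",
    "bl", "br", "dr", "fl", "fr", "gl", "gr", "kl", "kr", "pl", "pr", "tr",
    "sk", "sp", "st", "str", "spr", "skr",
}


def syllabify(word):
    """Split a word into syllables, maximizing each syllable's onset."""
    word = word.lower()

    # Locate the nuclei: each maximal run of vowels counts as one nucleus
    # (a simplification that ignores hiatus).
    nuclei = []
    i = 0
    while i < len(word):
        if word[i] in VOWELS:
            j = i
            while j < len(word) and word[j] in VOWELS:
                j += 1
            nuclei.append((i, j))
            i = j
        else:
            i += 1

    if not nuclei:
        return [word]

    boundaries = [0]
    for (_, prev_end), (next_start, _) in zip(nuclei, nuclei[1:]):
        cluster = word[prev_end:next_start]
        # Try the longest candidate onset first; the empty onset always matches.
        split = len(cluster)
        for k in range(len(cluster) + 1):
            if cluster[k:] in LEGAL_ONSETS:
                split = k
                break
        boundaries.append(prev_end + split)
    boundaries.append(len(word))

    return [word[a:b] for a, b in zip(boundaries, boundaries[1:])]
```

With this toy inventory, `syllabify("extra")` splits the cluster `xtr` so that `tr` opens the second syllable, because `xtr` is not listed as a legal onset.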
Abstract:
This paper describes a linguistically motivated, rule-based grapheme-to-phone (G2P) transcription algorithm for European Portuguese. A complete set of phonological and phonetic transcription rules for the European Portuguese standard variety is presented. The algorithm was implemented and tested on online newspaper articles, yielding an accuracy rate of 98.80%. Future developments to increase this value are foreseen. Our purpose with this work is to develop a module or tool that can improve the naturalness of synthetic speech in European Portuguese. Other applications of the system, such as language teaching and learning, can also be expected. These results, together with our prospects for future improvements, demonstrate the importance of linguistic knowledge in the development of Text-to-Speech (TTS) systems.
Abstract:
In the last few years, the number of systems and devices that use voice-based interaction has grown significantly. For continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However, there are currently very few studies that try to evaluate how pleasant a voice is from a perceptual point of view when the final application is a speech-based interface. In this paper we present an objective definition of voice pleasantness based on the composition of a representative feature subset, together with a new automatic system for voice pleasantness classification and intensity estimation. Our study is based on a database composed of European Portuguese female voices, but the methodology can be extended to male voices or to other languages. In the objective performance evaluation the system achieved a 9.1% error rate for voice pleasantness classification and a 15.7% error rate for voice pleasantness intensity estimation.
Abstract:
This essay aims to confront the literary text Wuthering Heights by Emily Brontë with five of its screen adaptations and Portuguese subtitles. Owing to the scope of the study, it will necessarily afford merely a bird's eye view of the issues and serve as a starting point for further research. Accordingly, the following questions are used as guidelines: What transformations occur in the process of adapting the original text to the screen? Do subtitles update the film dialogues to the target audience's cultural and linguistic context? Are subtitles influenced more by oral speech than by written literary discourse? Shouldn't subtitles in fact reflect the poetic function prevalent in screen adaptations of literary texts? Rather than attempt to answer these questions, we focus on the objects as phenomena. Our interdisciplinary undertaking clearly involves a semio-pragmatic stance, at this stage trying to avoid theoretical backdrops that may affect our apprehension of the objects as to their qualities, singularities, and conventional traits, based on Lucia Santaella's interpretation of Charles S. Peirce's phaneroscopy. From an empirical standpoint, we gather features and describe peculiarities, under the presumption that there are substrata in subtitling that point or should point to the literary source text, albeit through the mediation of a film script and a particular cinematic style. Therefore, we consider how the subtitling process may be influenced by the literary intertext, the idiosyncrasies of a particular film adaptation, as well as the socio-cultural context of the subtitler and target audience. First, we isolate one of the novel's most poignant scenes – 'I am Heathcliff' – taking into account its symbolic play and significance in relation to character and plot construction. Secondly, we study American, English, French, and Mexican adaptations of the excerpt into film in terms of intersemiotic transformations.
Then we analyze differences between the film dialogues and their Portuguese subtitles.
Abstract:
The aim of this paper is to present the main Portuguese results of a multi-national study on the reading format preferences and behaviors of undergraduate students at the Polytechnic Institute of Porto (Portugal). For this purpose we apply an adaptation of the Academic Reading Questionnaire previously created by Mizrachi (2014). This survey instrument has 14 Likert-style statements on how format influences students' reading behavior, including aspects such as ability to remember, feelings about access convenience, active engagement with the text by highlighting and annotating, and ability to review and concentrate on the text. The survey also asks how the language and length of the text affect the preferred format. Students are also asked which electronic device they use to read digital documents. Finally, some demographic and academic data were gathered. The analysis of the results will be contextualized by a review of the literature on young people's reading format preferences. The format (digital or print) in which a text is displayed and read can affect comprehension, which is an important information literacy skill. This is a relevant issue for class readings in an academic context because it affects learning. In addition, students' preferences for reading formats will influence the use of library services. However, the literature is not unanimous on this subject. Woody, Daniel and Baker (2010) concluded that the experience of reading is not the same in electronic and print contexts and that students prefer print books to e-books. This view is reinforced by Ji, Michaels and Waterman (2014), who report that, among 101 undergraduates, the large majority self-reported reading and learning more when using the printed format, even though they preferred electronically supplied readings to those supplied in print.
On the other hand, Rockinson-Szapkiw et al. (2013) conducted a study in which they demonstrate that e-textbooks are as effective for learning as traditional textbooks and that students who chose e-textbooks had significantly higher perceived learning than students who chose print textbooks.
Abstract:
This paper proposes a module for homograph disambiguation in Portuguese Text-to-Speech (TTS). The module works with a part-of-speech (POS) parser, used to disambiguate homographs that belong to different parts of speech, and a semantic analyzer, used to disambiguate homographs that belong to the same part of speech. The proposed algorithms are meant to solve a significant part of homograph ambiguity in European Portuguese (EP) (106 homograph pairs so far). The system is ready to be integrated into a Letter-to-Sound (LTS) converter. The algorithms were trained and tested with different corpora, yielding an accuracy rate of 97.8%. The methodology is also valid for Brazilian Portuguese (BP), since 95 homograph pairs are exactly the same as in EP. A comparison with a probabilistic approach was also carried out, and the results are discussed.
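The two-stage idea in this abstract (POS first, then semantics) can be sketched as follows. It is only an illustration under stated assumptions, not the module described in the paper. The lexicons `POS_HOMOGRAPHS` and `SENSE_HOMOGRAPHS`, the function `disambiguate`, the cue words, and the transcriptions are hypothetical examples. A simple cue-word overlap stands in for the semantic analyzer, whose actual design the abstract does not specify.

```python
# Hypothetical toy lexicons; entries and transcriptions are illustrative only.
# Stage 1: homographs whose pronunciation depends on the part of speech.
POS_HOMOGRAPHS = {
    "gosto": {"NOUN": "ˈgoʃtu", "VERB": "ˈgɔʃtu"},
}

# Stage 2: homographs with the same part of speech, separated here by
# context cue words (a stand-in for a real semantic analyzer).
SENSE_HOMOGRAPHS = {
    "sede": [
        ({"água", "beber", "fome"}, "ˈsedɨ"),            # "thirst"
        ({"empresa", "edifício", "escritório"}, "ˈsɛdɨ"),  # "headquarters"
    ],
}


def disambiguate(word, pos, context):
    """Return a transcription for a known homograph, or None if undecided."""
    word = word.lower()

    # Stage 1: the POS tag alone decides.
    if word in POS_HOMOGRAPHS:
        return POS_HOMOGRAPHS[word].get(pos)

    # Stage 2: pick the sense whose cue words overlap most with the context.
    if word in SENSE_HOMOGRAPHS:
        context_words = {w.lower() for w in context}
        best, best_score = None, 0
        for cues, transcription in SENSE_HOMOGRAPHS[word]:
            score = len(cues & context_words)
            if score > best_score:
                best, best_score = transcription, score
        return best

    return None
```

Returning `None` when no cue matches leaves the decision to a later fallback, for example a default pronunciation chosen elsewhere in the pipeline.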
Abstract:
For some years now, translation theorist and educator Anthony Pym has been trying to establish a dialogue between the academic tradition he comes from and the world of the language industries into which he is meant to introduce his students: in other words, between the Translation Studies discipline and the localisation sector. This rapprochement is also the stated aim of his new book The Moving Text (p. 159). Rather than collect and synthesise what was previously dispersed over several articles, Pym has rewritten his material completely, both literally and conceptually, all in the light of the more than three decades of research he has conducted into the field of cross-cultural communication. The theoretical arguments are ably supported by a few short but telling and well-exploited examples.
Abstract:
Text file evaluation is an emergent topic in e-learning that responds to the shortcomings of assessment based on questions with predefined answers. Questions with predefined answers are formalized in languages such as the IMS Question & Test Interoperability Specification (QTI) and supported by many e-learning systems. Complex evaluation domains justify the development of specialized evaluators that participate in several business processes. The goal of this paper is to formalize the concept of text file evaluation in the scope of the E-Framework, a service-oriented framework for the development of e-learning systems maintained by a community of practice. The contribution includes an abstract service type and a service usage model. The former describes the generic capabilities of a text file evaluation service. The latter is a business process involving a set of services such as repositories of learning objects and learning management systems.