889 resultados para Corpus bruit


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present the process of designing an efficient speech corpus for the first unit selection speech synthesis system for Bulgarian, along with some significant preliminary results regarding the quality of the resulted system. As the initial corpus is a crucial factor for the quality delivered by the Text-to-Speech system, special effort has been given in designing a complete and efficient corpus for use in a unit selection TTS system. The targeted domain of the TTS system and hence that of the corpus is the news reports, and although it is a restricted one, it is characterized by an unlimited vocabulary. The paper focuses on issues regarding the design of an optimal corpus for such a framework and the ideas on which our approach was based on. A novel multi-stage approach is presented, with special attention given to language and speaker dependent issues, as they affect the entire process. The paper concludes with the presentation of our results and the evaluation experiments, which provide clear evidence of the quality level achieved. © 2011 Springer-Verlag.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Raybould, M. and Sims-Williams, P. (2007). A Corpus of Latin Inscriptions of the Roman Empire containing Celtic personal names. Aberystwyth: CMCS publications. RAE2008

Relevância:

20.00% 20.00%

Publicador:

Resumo:

P.M. Hastie and W. Haresign (2006). A role for LH in the regulation of expression of mRNAs encoding components of the insulin-like growth factor (IGF) system in the ovine corpus luteum. Animal Reproduction Science, 96(1-2), 196-209. Sponsorship: DEFRA RAE2008

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Leaving Certificate (LC) is the national, standardised state examination in Ireland necessary for entry to third level education – this presents a massive, raw corpus of data with the potential to yield invaluable insight into the phenomena of learner interlanguage. With samples of official LC Spanish examination data, this project has compiled a digitised corpus of learner Spanish comprised of the written and oral production of 100 candidates. This corpus was then analysed using a specific investigative corpus technique, Computer-aided Error Analysis (CEA, Dagneaux et al, 1998). CEA is a powerful apparatus in that it greatly facilitates the quantification and analysis of a large learner corpus in digital format. The corpus was both compiled and analysed with the use of UAM Corpus Tool (O’Donnell 2013). This Tool allows for the recording of candidate-specific variables such as grade, examination level, task type and gender, therefore allowing for critical analysis of the corpus as one unit, as separate written and oral sub corpora and also of performance per task, level and gender. This is an interdisciplinary work combining aspects of Applied Linguistics, Learner Corpus Research and Foreign Language (FL) Learning. Beginning with a review of the context of FL learning in Ireland and Europe, I go on to discuss the disciplinary context and theoretical framework for this work and outline the methodology applied. I then perform detailed quantitative and qualitative analyses before going on to combine all research findings outlining principal conclusions. This investigation does not make a priori assumptions about the data set, the LC Spanish examination, the context of FLs or of any aspect of learner competence. It undertakes to provide the linguistic research community and the domain of Spanish language learning and pedagogy in Ireland with an empirical, descriptive profile of real learner performance, characterising learner difficulty.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Objective Describe the methodology and selection of quality indicators (QI) to be implemented in the EFFECT (EFFectiveness of Endometrial Cancer Treatment) project. EFFECT aims to monitor the variability in Quality of Care (QoC) of uterine cancer in Belgium, to compare the effectiveness of different treatment strategies to improve the QoC and to check the internal validity of the QI to validate the impact of process indicators on outcome. Methods A QI list was retrieved from literature, recent guidelines and QI databases. The Belgian Healthcare Knowledge Center methodology was used for the selection process and involved an expert's panel rating the QI on 4 criteria. The resulting scores and further discussion resulted in a final QI list. An online EFFECT module was developed by the Belgian Cancer Registry including the list of variables required for measuring the QI. Three test phases were performed to evaluate the relevance, feasibility and understanding of the variables and to test the compatibility of the dataset. Results 138 QI were considered for further discussion and 82 QI were eligible for rating. Based on the rating scores and consensus among the expert's panel, 41 QI were considered measurable and relevant. Testing of the data collection enabled optimization of the content and the user-friendliness of the dataset and online module. Conclusions This first Belgian initiative for monitoring the QoC of uterine cancer indicates that the previously used QI selection methodology is reproducible for uterine cancer. The QI list could be applied by other research groups for comparison. © 2013 Elsevier Inc.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

L'article présente quelques éléments de la procédure mise en place pour traiter un corpus écrit comportant 617 textes (près de 500 000 mots) relatifs aux eurorégions. Complexe et hétérogène à plusieurs titres (technique, linguistique, éditorial, générique, énonciatif), le corpus pose la difficulté majeure de l’appréhension de données multilingues (français, italien, espagnol, anglais, allemand, néerlandais). Sa manipulation a nécessité une réflexion adaptée et une démarche de modélisation que nous qualifions d’« agile » en raison de son caractère souple et itératif. La plateforme d’analyse élaborée permet de disposer de résultats utiles à l’analyse qualitative ultérieure du discours eurorégional. Elle articule un logiciel d'analyse morphosyntaxique éprouvé (TreeTagger) à des programmes (Perl) et à une base de données (SQLite) développés pour optimiser les requêtes multilingues simultanées et l’exportation automatique des résultats. Les fonctionnalités liées à la localisation contextualisée de mots- pivots, au recueil de dénominations et à la détection de segments répétés nous servent ici de guides pour exprimer les besoins de la recherche, les problèmes rencontrés et les solutions proposées. L'analyse d'observables récurrents, à savoir les notions de décision et de responsabilité, illustre le propos.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

L’objectif de cet article est double :il s’agit, d’une part, de présenter un nouveau corpus permettant d’envisager le phénomène émergent de la communication transfrontalière en Europe et, d’autre part, de formuler trois questionnements utiles au cadrage de son analyse sémantique. À partir du corpus eurorégional – multilingue et multigenre - nous posons les questions de la dispersion des discours en ligne, de l’hétérogénéité des données et de la contextualisation de l’analyse. Notre démarche consiste à construire progressivement un modèle d’analyse adapté à l’appréhension, tantôt automatique et tantôt manuelle, de la diversité des textes. Enfin, nous proposons d’illustrer la démarche en l’appliquant à la mobilité, observable récurrent du corpus.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The objective of this paper is to describe and evaluate the application of the Text Encoding Initiative (TEI) Guidelines to a corpus of oral French, this being the first corpus of oral French where the TEI has been used. The paper explains the purpose of the corpus, both in creating a specialist corpus of néo-contage that will broaden the range of oral corpora available, and, more importantly, in creating a dataset to explore a variety of oral French that has a particularly interesting status in terms of factors such as conception orale/écrite, réalisation médiale and comportement communicatif (Koch and Oesterreicher 2001). The linguistic phenomena to be encoded are both stylistic (speech and thought presentation) and syntactic (negation, detachment, inversion), and all represent areas where previous research has highlighted the significance of factors such as medium, register and discourse type, as well as a host of linguistic factors (syntactic, phonetic, lexical). After a discussion of how a tagset can be designed and applied within the TEI to encode speech and thought presentation, negation, detachment and inversion, the final section of the paper evaluates the benefits and possible drawbacks of the methodology offered by the TEI when applied to a syntactic and stylistic markup of an oral corpus.

Relevância:

20.00% 20.00%

Publicador: