Digital Library

874 results for corpus luteum

Corpus design for a unit selection TtS system with application to Bulgarian

Relevance: 20.00%

Abstract:

In this paper we present the process of designing an efficient speech corpus for the first unit selection speech synthesis system for Bulgarian, along with some significant preliminary results on the quality of the resulting system. Because the initial corpus is a crucial factor in the quality delivered by a Text-to-Speech (TTS) system, special effort was devoted to designing a complete and efficient corpus for use in a unit selection TTS system. The targeted domain of the TTS system, and hence of the corpus, is news reports; although this domain is restricted, it is characterized by an unlimited vocabulary. The paper focuses on issues in designing an optimal corpus for such a framework and on the ideas on which our approach is based. A novel multi-stage approach is presented, with special attention given to language- and speaker-dependent issues, as they affect the entire process. The paper concludes with the presentation of our results and the evaluation experiments, which provide clear evidence of the quality level achieved. © 2011 Springer-Verlag.
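The abstract does not describe the multi-stage procedure itself, so the following is only a generic illustration of the underlying problem: choosing a small set of sentences that covers as many synthesis units as possible. It uses a plain greedy set-cover heuristic, which is a common baseline and not the authors' method. The names `units` and `greedy_select` are hypothetical, and adjacent letter pairs stand in for real phonetic units such as diphones.

```python
def units(sentence):
    """Stand-in for diphones: adjacent letter pairs within each word (an assumption)."""
    pairs = set()
    for word in sentence.lower().split():
        pairs.update(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def greedy_select(sentences, budget):
    """Pick up to `budget` sentences, each time taking the one that adds the most unseen units."""
    covered, chosen = set(), []
    remaining = list(sentences)
    while remaining and len(chosen) < budget:
        # Candidate with the largest number of units not yet covered.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:
            break  # nothing left to gain; stop early
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered


candidates = ["abc", "abcd", "xy", "ab"]
chosen, covered = greedy_select(candidates, budget=3)
```

With the toy candidates above, selection stops after two sentences because the remaining ones add no new units. A real design would replace `units` with a phonetic transcription step and would typically weight units by frequency in the target domain.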

A Corpus of Latin Inscriptions of the Roman Empire containing Celtic personal names

Relevance: 20.00%

Abstract:

Raybould, M. and Sims-Williams, P. (2007). A Corpus of Latin Inscriptions of the Roman Empire containing Celtic personal names. Aberystwyth: CMCS publications. RAE2008

A corpus-driven error analysis of the oral and written production of Leaving Certificate Spanish learners

Relevance: 20.00%

Abstract:

The Leaving Certificate (LC) is the national, standardised state examination in Ireland necessary for entry to third level education. It presents a massive, raw corpus of data with the potential to yield invaluable insight into the phenomena of learner interlanguage. With samples of official LC Spanish examination data, this project has compiled a digitised corpus of learner Spanish comprising the written and oral production of 100 candidates. This corpus was then analysed using a specific investigative corpus technique, Computer-aided Error Analysis (CEA, Dagneaux et al., 1998). CEA is a powerful apparatus in that it greatly facilitates the quantification and analysis of a large learner corpus in digital format. The corpus was both compiled and analysed with the UAM Corpus Tool (O’Donnell 2013). This tool allows candidate-specific variables such as grade, examination level, task type and gender to be recorded, and so allows critical analysis of the corpus as one unit, as separate written and oral subcorpora, and of performance per task, level and gender. This is an interdisciplinary work combining aspects of Applied Linguistics, Learner Corpus Research and Foreign Language (FL) Learning. Beginning with a review of the context of FL learning in Ireland and Europe, I go on to discuss the disciplinary context and theoretical framework for this work and outline the methodology applied. I then perform detailed quantitative and qualitative analyses before combining all research findings and outlining the principal conclusions. This investigation does not make a priori assumptions about the data set, the LC Spanish examination, the context of FLs or any aspect of learner competence. It undertakes to provide the linguistic research community and the domain of Spanish language learning and pedagogy in Ireland with an empirical, descriptive profile of real learner performance, characterising learner difficulty.

Evaluation of the quality of the management of cancer of the corpus uteri: Selection of relevant quality indicators and implementation in Belgium

Relevance: 20.00%

Abstract:

Objective: Describe the methodology and selection of quality indicators (QI) to be implemented in the EFFECT (EFFectiveness of Endometrial Cancer Treatment) project. EFFECT aims to monitor the variability in Quality of Care (QoC) of uterine cancer in Belgium, to compare the effectiveness of different treatment strategies to improve the QoC and to check the internal validity of the QI to validate the impact of process indicators on outcome. Methods: A QI list was retrieved from literature, recent guidelines and QI databases. The Belgian Healthcare Knowledge Center methodology was used for the selection process and involved an expert panel rating the QI on 4 criteria. The resulting scores and further discussion resulted in a final QI list. An online EFFECT module was developed by the Belgian Cancer Registry including the list of variables required for measuring the QI. Three test phases were performed to evaluate the relevance, feasibility and understanding of the variables and to test the compatibility of the dataset. Results: 138 QI were considered for further discussion and 82 QI were eligible for rating. Based on the rating scores and consensus among the expert panel, 41 QI were considered measurable and relevant. Testing of the data collection enabled optimization of the content and the user-friendliness of the dataset and online module. Conclusions: This first Belgian initiative for monitoring the QoC of uterine cancer indicates that the previously used QI selection methodology is reproducible for uterine cancer. The QI list could be applied by other research groups for comparison. © 2013 Elsevier Inc.

Traitement de données issues d'un corpus écrit multilingue. Approche agile pour l'analyse du discours eurorégional

Relevance: 20.00%

Abstract:

The article presents some elements of the procedure set up to process a written corpus of 617 texts (nearly 500,000 words) relating to Euroregions. Complex and heterogeneous in several respects (technical, linguistic, editorial, generic, enunciative), the corpus poses the major difficulty of handling multilingual data (French, Italian, Spanish, English, German, Dutch). Working with it required a tailored approach and a modelling process that we describe as "agile" because of its flexible and iterative nature. The analysis platform we developed yields results useful for the subsequent qualitative analysis of Euroregional discourse. It combines a proven morphosyntactic analysis tool (TreeTagger) with programs (Perl) and a database (SQLite) developed to optimise simultaneous multilingual queries and the automatic export of results. The features for locating pivot words in context, collecting denominations and detecting repeated segments serve here as guides for setting out the research needs, the problems encountered and the solutions proposed. The analysis of recurrent observables, namely the notions of decision and responsibility, illustrates the argument.
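Two of the features named in this abstract, locating pivot words in context and detecting repeated segments, can be illustrated with a minimal sketch. The authors' platform is built on TreeTagger, Perl and SQLite; the Python below is not their code. It assumes naive whitespace tokenisation, and the function names `kwic` and `repeated_segments` and the sample sentence are hypothetical.

```python
from collections import Counter


def kwic(tokens, pivot, width=3):
    """Return (left context, pivot, right context) for each occurrence of `pivot`."""
    hits = []
    target = pivot.lower()
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits


def repeated_segments(tokens, n=2, min_count=2):
    """Return the n-token sequences that occur at least `min_count` times."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {gram: count for gram, count in grams.items() if count >= min_count}


# Invented sample text, tokenised on whitespace (an assumption for illustration).
sample = ("la décision relève de la région et la décision engage "
          "la responsabilité de la région").split()

contexts = kwic(sample, "décision", width=2)
bigrams = repeated_segments(sample, n=2)
```

On a real multilingual corpus, the tokens would come from a tagger's output rather than `str.split`, and the counts would more likely be stored and queried per language in a database than held in memory.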

La formation discursive eurorégionale. Articulation et approche sémantique d’un corpus multilingue

Relevance: 20.00%

Abstract:

The aim of this article is twofold: on the one hand, to present a new corpus for examining the emerging phenomenon of cross-border communication in Europe and, on the other, to formulate three questions useful for framing its semantic analysis. Starting from the Euroregional corpus, which is multilingual and multi-genre, we raise the questions of the dispersion of online discourse, the heterogeneity of the data and the contextualisation of the analysis. Our approach consists in progressively building an analysis model suited to handling the diversity of the texts, sometimes automatically and sometimes manually. Finally, we illustrate the approach by applying it to mobility, a recurrent observable in the corpus.

'Aspects of the Verb Phrase in Standard Irish English: A Corpus-based Approach'

Relevance: 20.00%

“Teaching Critical Skills in Corpus Linguistics Using the BNC”

Relevance: 20.00%

The International Corpus of English: The Irish Component (ICE-Ireland) v. 1.2.2

Relevance: 20.00%

LI-BEL CASE: a corpus of spoken academic English

Relevance: 20.00%

Annotating an Oral Corpus using the Text Encoding Initiative. Methodology, Problems, Solutions

Relevance: 20.00%

Abstract:

The objective of this paper is to describe and evaluate the application of the Text Encoding Initiative (TEI) Guidelines to a corpus of oral French, this being the first corpus of oral French where the TEI has been used. The paper explains the purpose of the corpus, both in creating a specialist corpus of néo-contage that will broaden the range of oral corpora available, and, more importantly, in creating a dataset to explore a variety of oral French that has a particularly interesting status in terms of factors such as conception orale/écrite, réalisation médiale and comportement communicatif (Koch and Oesterreicher 2001). The linguistic phenomena to be encoded are both stylistic (speech and thought presentation) and syntactic (negation, detachment, inversion), and all represent areas where previous research has highlighted the significance of factors such as medium, register and discourse type, as well as a host of linguistic factors (syntactic, phonetic, lexical). After a discussion of how a tagset can be designed and applied within the TEI to encode speech and thought presentation, negation, detachment and inversion, the final section of the paper evaluates the benefits and possible drawbacks of the methodology offered by the TEI when applied to a syntactic and stylistic markup of an oral corpus.

Assessing Celticity in a Corpus of Irish Standard English

Relevance: 20.00%

The SPICE-Ireland Corpus, v. 1.2.2

Relevance: 20.00%

The Corpus Martianus Capella: Continental Gloss Traditions on 'De Nuptiis' in Wales and Anglo-Saxon England

Relevance: 20.00%

Creating a spontaneous conversational speech corpus

Relevance: 20.00%

Abstract:

Speech recognition and language analysis of spontaneous speech arising in naturally spoken conversations are becoming the subject of much research. However, there is a shortage of spontaneous speech corpora that are freely available for academics. We therefore undertook the building of a natural conversation speech database, recording over 200 hours of conversations in English by over 600 local university students. With few exceptions, the students used their own cell phones from their own rooms or homes to speak to one another, and they were permitted to speak on any topic they chose. Although they knew that they were being recorded and that they would receive a small payment, their conversations in the corpus are probably very close to being natural and spontaneous. This paper describes a detailed case study of the problems we faced and the methods we used to make the recordings and control the collection of these social science data on a limited budget.
