967 results for Comparable corpora
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
The main purpose of this investigation is to analyze the most frequent simple terms and fixed and semi-fixed expressions in the subarea of Social Political Economy in Portuguese, together with their corresponding terms in English, found in fifteen papers written by Bresser-Pereira and in his self-translated texts. The methodology draws on Corpus-Based Translation Studies (Baker, 1992, 1993, 1995, 1996; Camargo, 2005, 2007), Corpus Linguistics (Berber Sardinha, 2004) and Terminology (Barros, 2004). Results show that the terms and expressions used in the source texts lack univocity within the specialized language of the Brazilian Social Sciences. The terms translated into English also show variation, owing to the choices made by the self-translator as he seeks to adapt the theoretical concepts to the possibilities of the Target Language.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
This study aims at discussing aspects related to learner corpora and linguistic features found in texts written by English learners based on the use of collocations in text production. For this research, we analyzed collocations with the verb “to have” and with the nouns “prejudice” and “regret”.
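As a rough illustration of the kind of collocation search this abstract describes, the sketch below counts the words that occur within a few tokens of a node word such as "regret". It is not the authors' procedure: the function name `collocates`, the window size and the sample sentences are all assumptions made up for this example.

```python
import re
from collections import Counter


def collocates(text, node, window=3):
    """Count words found within `window` tokens of each occurrence of `node`.

    Hypothetical helper for illustration only; tokenisation is a simple
    lowercase regex split, not a linguistically informed tokenizer.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]    # tokens before the node
            right = tokens[i + 1:i + 1 + window]   # tokens after the node
            counts.update(left + right)
    return counts


# Invented learner-style sentences, not taken from any corpus in the study.
sample = "I have prejudice against noise. They have a deep regret. We have regret too."
print(collocates(sample, "regret", window=2).most_common(3))
```

A real study would normally add an association measure (for example mutual information or log-likelihood) on top of raw co-occurrence counts.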
Abstract:
This paper deals with theoretical and methodological problems concerning the definition of criteria for selecting sources for the study of language. We discuss the use and importance of genre in corpus construction for the study of variation, both synchronic and diachronic.
Abstract:
This article presents an investigation of four linguistic phenomena of Portuguese at different synchronic stages. The research concerns the construction, description and analysis of corpora for the creation of databases that support linguistic description.
Abstract:
This study examines whether the widely used Strengths and Difficulties Questionnaire (SDQ) can validly be used to compare the prevalence of child mental health problems cross-nationally. We used data on 29,225 5- to 16-year-olds in eight population-based studies from seven countries: Bangladesh, Brazil, Britain, India, Norway, Russia and Yemen. Parents completed the SDQ in all eight studies, teachers in seven studies and youth in five studies. We used these SDQ data to calculate three different sorts of "caseness indicators" based on (1) SDQ symptoms, (2) SDQ symptoms plus impact and (3) an overall respondent judgement of 'definite' or 'severe' difficulties. Respondents also completed structured diagnostic interviews including extensive open-ended questions (the Development and Well-Being Assessment, DAWBA). Diagnostic ratings were all carried out or supervised by the DAWBA's creator, working in conjunction with experienced local professionals. As judged by the DAWBA, the prevalence of any mental disorder ranged from 2.2% in India to 17.1% in Russia. The nine SDQ caseness indicators (three indicators times three informants) explained 8-56% of the cross-national variation in disorder prevalence. This was insufficient to make meaningful prevalence estimates, since populations with a similar measured prevalence of disorder on the DAWBA showed large variations across the various SDQ caseness indicators. The relationship between SDQ caseness indicators and disorder rates varies substantially between populations: cross-national differences in SDQ indicators do not necessarily reflect comparable differences in disorder rates. More generally, considerable caution is required when interpreting cross-cultural comparisons of mental health, particularly when these rely on brief questionnaires.
Abstract:
Qualitative and quantitative research approaches are often considered incompatible, and when they are brought together in a study, the analyses often stay within the realm of the same research field. The present study aims to combine the two methods from the perspectives of different disciplines and to determine to what degree a corpus-based analysis might support the traditional content-focused approach to qualitative data and yield additional results.
Abstract:
This study aims to develop juridical and administrative terminology in the Ladin language, specifically in the Ladin idiom spoken in Val Badia. The need for this study is closely connected to the fact that in South Tyrol the Ladin language is not only safeguarded: the drafting of administrative and normative texts in Ladin is guaranteed by law. This means that a unified terminology is needed to support translators and editors of specialised texts. The starting points of this study are, on the one hand, the need for a unified terminology and, on the other, the translation work done so far by employees of the public administration working in Ladin. In order to document their efforts, a corpus made up of digitised administrative and normative documents was built. The first two chapters focus on the state of the art of projects on terminology and corpus linguistics for lesser-used languages. The information was collected with the help of institutes, universities and researchers dealing with lesser-used languages. The third chapter focuses on the development of administrative language in Ladin, and the fourth chapter on the creation of the trilingual Italian-German-Ladin corpus made up of administrative and normative documents. The last chapter deals with the methodologies applied to elaborate the terminology entries in Ladin through the use of the trilingual corpus. Starting from the terminology entry, all steps are described: term extraction, the extraction of equivalents, contexts and definitions, and the elaboration of translation proposals where no equivalent was found. Finally, the problems relating to the elaboration of terminology in Ladin are illustrated.
Abstract:
This paper proposes an alternative bibliometric indicator for evaluating scholarly journals, based on the percentage of highly cited articles in a journal. It compares this index with the impact factor and the h-index, using different time windows and different citation levels for deciding when a document counts as highly cited relative to others of the same year and discipline. The main outcome of this comparison suggests that the best index for obtaining data distributions that are comparable between scientific fields is obtained by taking the 20% citation percentile over a three-year citation window.
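The indicator described in this abstract can be sketched as follows. The abstract does not give an exact formula, so the code makes assumptions: "highly cited" is read here as reaching the citation count of the top 20% of articles in a same-year, same-discipline reference set, and the function name `highly_cited_share`, its parameters and the citation counts are hypothetical.

```python
def highly_cited_share(journal_citations, field_citations, top_fraction=0.20):
    """Percentage of a journal's articles that reach the field's top-fraction threshold.

    Assumption: the threshold is the citation count of the article ranked at
    the `top_fraction` cut-off in the reference set (same year and discipline).
    """
    if not journal_citations or not field_citations:
        raise ValueError("both citation lists must be non-empty")
    ranked = sorted(field_citations, reverse=True)
    # Number of reference articles that fall inside the top fraction (at least one).
    k = max(1, int(len(ranked) * top_fraction))
    threshold = ranked[k - 1]
    hits = sum(1 for c in journal_citations if c >= threshold)
    return 100.0 * hits / len(journal_citations)


# Invented citation counts, collected over an assumed three-year window.
field = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
journal = [0, 2, 21, 40]
print(highly_cited_share(journal, field))
```

Because the threshold is taken from each field's own distribution, the resulting percentage is relative to the field, which is the property the abstract highlights when comparing across disciplines.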
Abstract:
The construction and use of multimedia corpora has been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of the field of AVT, similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation, developed at the University of Bologna’s Department of Interdisciplinary Studies in Translation, Languages and Culture. As a matter of fact, in order to quantify strategies of linguistic transfer of an AV product, we need to take into consideration not only the linguistic aspect of such a product but all the meaning-making resources deployed in the filmic text. Provided that one major benefit of Forlixt 1 is the combination of audiovisual and textual data, this corpus allows the user to access primary data for scientific investigation, and thus no longer rely on pre-processed material such as traditional annotated transcriptions. Based on this rationale, the first chapter of the thesis sets out to illustrate the state of the art of research in the disciplinary fields involved. The primary objective was to underline the main repercussions on multimedia texts resulting from the interaction of a double support, audio and video, and, accordingly, on procedures, means, and methods adopted in their translation. By drawing on previous research in semiotics and film studies, the relevant codes at work in visual and acoustic channels were outlined. Subsequently, we concentrated on the analysis of the verbal component and on the peculiar characteristics of filmic orality as opposed to spontaneous dialogic production. In the second part, an overview of the main AVT modalities was presented (dubbing, voice-over, interlinguistic and intra-linguistic subtitling, audio-description, etc.) 
in order to define the different technologies, processes and professional qualifications that this umbrella term presently includes. The second chapter focuses diachronically on the contribution of various theories to the application of Corpus Linguistics methods and tools to the field of Translation Studies (i.e. Descriptive Translation Studies, Polysystem Theory). In particular, we discussed how the use of corpora can help reduce the gap between qualitative and quantitative approaches. Subsequently, we reviewed the tools traditionally employed by Corpus Linguistics for the construction of traditional “written language” corpora, to assess whether and how they can be adapted to meet the needs of multimedia corpora. In particular, we reviewed existing speech and spoken corpora, as well as multimedia corpora specifically designed to investigate translation. The third chapter reviews the main development steps of Forlixt 1 from a technical (IT design principles, data query functions) and methodological point of view, laying down extensive scientific foundations for the annotation methods adopted, which presently encompass categories of a pragmatic, sociolinguistic, linguacultural and semiotic nature. Finally, we described the main query tools (free search, guided search, advanced search and combined search) and the main intended uses of the database from a pedagogical perspective. The fourth chapter lists the specific compilation criteria adopted, as well as statistics for the two sub-corpora, presenting data broken down by language pair (French-Italian and German-Italian) and genre (film comedies, television soap operas and crime series). Next, we concentrated on the discussion of the results obtained from the analysis of summary tables reporting the frequency of categories applied to the French-Italian sub-corpus.
The detailed observation of the distribution of categories identified in the original and dubbed corpus allowed us to empirically confirm some of the theories put forward in the literature and notably concerning the nature of the filmic text, the dubbing process and Italian dubbed language’s features. This was possible by looking into some of the most problematic aspects, like the rendering of socio-linguistic variation. The corpus equally allowed us to consider so far neglected aspects, such as pragmatic, prosodic, kinetic, facial, and semiotic elements, and their combination. At the end of this first exploration, some specific observations concerning possible macrotranslation trends were made for each type of sub-genre considered (cinematic and TV genre). On the grounds of this first quantitative investigation, the fifth chapter intended to further examine data, by applying ad hoc models of analysis. Given the virtually infinite number of combinations of categories adopted, and of the latter with searchable textual units, three possible qualitative and quantitative methods were designed, each of which was to concentrate on a particular translation dimension of the filmic text. The first one was the cultural dimension, which specifically focused on the rendering of selected cultural references and on the investigation of recurrent translation choices and strategies justified on the basis of the occurrence of specific clusters of categories. The second analysis was conducted on the linguistic dimension by exploring the occurrence of phrasal verbs in the Italian dubbed corpus and by ascertaining the influence on the adoption of related translation strategies of possible semiotic traits, such as gestures and facial expressions. Finally, the main aim of the third study was to verify whether, under which circumstances, and through which modality, graphic and iconic elements were translated into Italian from an original corpus of both German and French films. 
After having reviewed the main translation techniques at work, an exhaustive account of possible causes for their non-translation was also provided. By way of conclusion, the discussion of results obtained from the distribution of annotation categories on the French-Italian corpus, as well as the application of specific models of analysis, allowed us to underline possible advantages and drawbacks of adopting a corpus-based approach to AVT studies. Although possible updates and improvements were proposed to help solve some of the problems identified, it is argued that the added value of Forlixt 1 lies ultimately in having created a valuable instrument, one that allows empirically sound contrastive studies to be carried out and usefully replicated on different language pairs and several types of multimedia texts. Furthermore, multimedia corpora can also play a crucial role in L2 and translation teaching, two disciplines in which their use still lacks systematic investigation.
Abstract:
This thesis is concerned with the role played by software tools in the analysis and dissemination of linguistic corpora and their contribution to a more widespread adoption of corpora in different fields. Chapter 1 contains an overview of some of the most relevant corpus analysis tools available today, presenting their most interesting features and some of their drawbacks. Chapter 2 begins with an explanation of the reasons why none of the available tools appears to satisfy the requirements of the user community and then continues with a technical overview of the current status of the new system developed as part of this work. This presentation is followed by highlights of features that make the system appealing to users and corpus builders (i.e. scholars willing to make their corpora available to the public). The chapter concludes with an indication of future directions for the project and information on the current availability of the software. Chapter 3 describes the design of an experiment devised to evaluate the usability of the new system in comparison to another corpus tool. Usage of the tool was tested in the context of a documentation task performed on a real assignment during a translation class in a master's degree course. In Chapter 4 the findings of the experiment are presented on two levels of analysis: first, a discussion of how participants interacted with and evaluated the two corpus tools in terms of interface and interaction design, usability and perceived ease of use; then an analysis of how users interacted with corpora to complete the task and what kind of queries they submitted. Finally, some general conclusions are drawn and areas for future work are outlined.
Abstract:
The thesis is divided into four parts. The first, feminist in approach, offers an overview of femicide as a social phenomenon and of the related international legal situation. The second deals in general terms with the quality press, the medium chosen for the linguistic analysis. The third part presents a micro-corpus of Italian press coverage of femicide, and the fourth a micro-corpus of French press coverage of the "Affaire DSK", each accompanied by an analysis of the lexical and discursive components (Analyse du discours). It is a comparative study whose results made it possible to highlight and demonstrate how the Italian and French quality press tend to implicitly convey a sexist, stereotyped and discriminatory image of femicide and of the victim of violence.
Abstract:
Contacts between languages have always led to mutual influence. Today, the position of authority of the English language affects Italian in many ways, especially in the scientific and technical fields. When new studies conceived in the English-speaking world reach the Italian public, we are faced not only with the translation of texts but, most importantly, with the rendition of theoretical constructs that do not always have a suitable rendering in the target language. That is why we often find anglicisms in Italian texts. This work aims to show their frequency in a specific field, underlining how and when they are used, and sometimes preferred to the corresponding Italian word. This dissertation looks at a sample of essays from the specialised magazine “Lavoro Sociale”, published by Edizioni Centro Studi Erickson, searching for borrowings from English and discussing their use in order to make hypotheses on the reasons for this phenomenon, against the wider background of translation studies and translation universals research. What I am most interested in is understanding the similarities and differences in the use of anglicisms by authors of Italian texts and by translators from English into Italian, so that I can identify the main dynamics and tendencies. The dissertation has four parts. Chapter 1 briefly explains the theoretical background on translation studies, and introduces and discusses the notion of translation universals. After that, the research methodology and theoretical background on linguistic borrowings (especially anglicisms) in Italian are summarized. Chapter 2 presents the study, explaining the organisation of the material, the methodology used and the object of interest. Chapter 3 is the core of the dissertation because it contains the qualitative and quantitative data taken from the texts and the examination of the dynamics of the use of anglicisms.
Finally, Chapter 4 compares the conclusions drawn from the previous chapter with the opinions of authors, translators and proof-readers, whom I asked to answer a questionnaire written specifically to investigate the mechanisms and choices behind their use of anglicisms.
Abstract:
The aim of this dissertation is to analyze the language of evaluation in Italian, English and French sustainability reports in order to observe how firms build their corporate image and to investigate the kind of relationship they develop with their stakeholders. The analysis is carried out by applying Martin & White's Appraisal theory and corpus linguistics methods. For the purposes of this research, a multilingual specialized corpus of sustainability reports has been created, which is the result of two different levels of compilation. At the first level, three sub-corpora have been created with the aim of representing three different languages (Italian, English and French): at this level, the research on evaluative language will show that a standardization process of sustainability reports is underway. At the second level of compilation, each of the three sub-corpora has been split in two further sub-corpora, representative of two different business sectors: at this level, the research will show how the sector where firms operate directly influences the choice of the topics to be discussed. The first chapter of this dissertation introduces the concept of evaluative language, with a particular focus on the framework of Appraisal theory. The second chapter deals with corpus linguistics and describes different types of corpora, the search methods and the criteria for the compilation of corpora. The third chapter discusses the concepts of Corporate Social Responsibility and sustainability reports, focusing mainly on the reporting principles and the linguistic patterns of this genre, and provides an overview of the main guidelines and certifications for the reporting of sustainability actions. Chapter four is dedicated to the description of the methodology used for this research, while the last chapter presents and discusses the results of the analysis, in an attempt to draw generalizations on the use of evaluative language in this emerging genre.