571 results for Corpora Pedunculata


This article investigates four linguistic phenomena of Portuguese in different synchronies. The studies concern the construction, description and analysis of corpora for creating databases that support linguistic description.


Fundacao de Amparo a Pesquisa do Estado de Sao Paulo (FAPESP)


Qualitative and quantitative research approaches are often considered incompatible, and when they are brought together in a study, the analyses often stay within the same research field. The study at hand aims to combine the two methods from the perspectives of different disciplines and tries to determine to what degree a corpus-based analysis might support the traditional content-focused approach to qualitative data and yield additional results.


This study aims to elaborate juridical and administrative terminology in the Ladin language, specifically in the Ladin idiom spoken in Val Badia. The need for this study is closely connected to the fact that in South Tyrol the Ladin language is not just safeguarded: the drafting of administrative and normative texts in Ladin is guaranteed by law. This means that a unified terminology is needed to support translators and editors of specialised texts. The starting points of this study are, on the one hand, the need for a unified terminology and, on the other, the translation work done so far by the employees of the public administration in the Ladin language. In order to document their efforts, a corpus made up of digitised administrative and normative documents was built. The first two chapters focus on the state of the art of projects on terminology and corpus linguistics for lesser-used languages. The information was collected with the help of institutes, universities and researchers dealing with lesser-used languages. The third chapter focuses on the development of administrative language in Ladin, and the fourth chapter focuses on the creation of the trilingual Italian-German-Ladin corpus made up of administrative and normative documents. The last chapter deals with the methodologies applied to elaborate the terminology entries in Ladin through the use of the trilingual corpus. Starting from the terminology entry, all steps are described, from term extraction to the extraction of equivalents, contexts and definitions, as well as the elaboration of translation proposals for equivalents that were not found. Finally, the problems relating to the elaboration of terminology in Ladin are illustrated.


The construction and use of multimedia corpora have been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of audiovisual translation (AVT), similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation, developed at the University of Bologna’s Department of Interdisciplinary Studies in Translation, Languages and Culture. As a matter of fact, in order to quantify strategies of linguistic transfer of an AV product, we need to take into consideration not only the linguistic aspect of such a product but all the meaning-making resources deployed in the filmic text. Given that one major benefit of Forlixt 1 is the combination of audiovisual and textual data, this corpus allows the user to access primary data for scientific investigation, and thus no longer rely on pre-processed material such as traditional annotated transcriptions. Based on this rationale, the first chapter of the thesis sets out to illustrate the state of the art of research in the disciplinary fields involved. The primary objective was to underline the main repercussions on multimedia texts resulting from the interaction of a double support, audio and video, and, accordingly, on procedures, means, and methods adopted in their translation. By drawing on previous research in semiotics and film studies, the relevant codes at work in visual and acoustic channels were outlined. Subsequently, we concentrated on the analysis of the verbal component and on the peculiar characteristics of filmic orality as opposed to spontaneous dialogic production. In the second part, an overview of the main AVT modalities was presented (dubbing, voice-over, interlinguistic and intra-linguistic subtitling, audio-description, etc.) in order to define the different technologies, processes and professional qualifications that this umbrella term presently includes.
The second chapter focuses diachronically on various theories’ contribution to the application of Corpus Linguistics’ methods and tools to the field of Translation Studies (i.e. Descriptive Translation Studies, Polysystem Theory). In particular, we discussed how the use of corpora can help reduce the gap existing between qualitative and quantitative approaches. Subsequently, we reviewed the tools traditionally employed by Corpus Linguistics in regard to the construction of traditional “written language” corpora, to assess whether and how they can be adapted to meet the needs of multimedia corpora. In particular, we reviewed existing speech and spoken corpora, as well as multimedia corpora specifically designed to investigate Translation. The third chapter reviews the main steps in the development of Forlixt 1, from a technical (IT design principles, data query functions) and methodological point of view, by laying down extensive scientific foundations for the annotation methods adopted, which presently encompass categories of pragmatic, sociolinguistic, linguacultural and semiotic nature. Finally, we described the main query tools (free search, guided search, advanced search and combined search) and the main intended uses of the database in a pedagogical perspective. The fourth chapter lists the specific compilation criteria retained, as well as statistics of the two sub-corpora, by presenting data broken down by language pair (French-Italian and German-Italian) and genre (cinema comedies, television soap operas and crime series). Next, we concentrated on the discussion of the results obtained from the analysis of summary tables reporting the frequency of categories applied to the French-Italian sub-corpus.
The detailed observation of the distribution of categories identified in the original and dubbed corpus allowed us to empirically confirm some of the theories put forward in the literature and notably concerning the nature of the filmic text, the dubbing process and Italian dubbed language’s features. This was possible by looking into some of the most problematic aspects, like the rendering of socio-linguistic variation. The corpus equally allowed us to consider so far neglected aspects, such as pragmatic, prosodic, kinetic, facial, and semiotic elements, and their combination. At the end of this first exploration, some specific observations concerning possible macrotranslation trends were made for each type of sub-genre considered (cinematic and TV genre). On the grounds of this first quantitative investigation, the fifth chapter intended to further examine data, by applying ad hoc models of analysis. Given the virtually infinite number of combinations of categories adopted, and of the latter with searchable textual units, three possible qualitative and quantitative methods were designed, each of which was to concentrate on a particular translation dimension of the filmic text. The first one was the cultural dimension, which specifically focused on the rendering of selected cultural references and on the investigation of recurrent translation choices and strategies justified on the basis of the occurrence of specific clusters of categories. The second analysis was conducted on the linguistic dimension by exploring the occurrence of phrasal verbs in the Italian dubbed corpus and by ascertaining the influence on the adoption of related translation strategies of possible semiotic traits, such as gestures and facial expressions. Finally, the main aim of the third study was to verify whether, under which circumstances, and through which modality, graphic and iconic elements were translated into Italian from an original corpus of both German and French films. 
After reviewing the main translation techniques at work, we also provided an exhaustive account of possible causes for their non-translation. By way of conclusion, the discussion of results obtained from the distribution of annotation categories on the French-Italian corpus, as well as the application of specific models of analysis, allowed us to underline possible advantages and drawbacks related to the adoption of a corpus-based approach to AVT studies. Even though possible updates and improvements were proposed in order to help solve some of the problems identified, it is argued that the added value of Forlixt 1 lies ultimately in having created a valuable instrument that makes it possible to carry out empirically sound contrastive studies that may be usefully replicated on different language pairs and several types of multimedia texts. Furthermore, multimedia corpora can also play a crucial role in L2 and translation teaching, two disciplines in which their use still lacks systematic investigation.


This thesis is concerned with the role played by software tools in the analysis and dissemination of linguistic corpora and their contribution to a more widespread adoption of corpora in different fields. Chapter 1 contains an overview of some of the most relevant corpus analysis tools available today, presenting their most interesting features and some of their drawbacks. Chapter 2 begins with an explanation of the reasons why none of the available tools appears to satisfy the requirements of the user community and then continues with a technical overview of the current status of the new system developed as part of this work. This presentation is followed by highlights of features that make the system appealing to users and corpus builders (i.e. scholars willing to make their corpora available to the public). The chapter concludes with an indication of future directions for the project and information on the current availability of the software. Chapter 3 describes the design of an experiment devised to evaluate the usability of the new system in comparison to another corpus tool. Usage of the tool was tested in the context of a documentation task performed on a real assignment during a translation class in a master's degree course. In Chapter 4 the findings of the experiment are presented on two levels of analysis: first, a discussion of how participants interacted with and evaluated the two corpus tools in terms of interface and interaction design, usability and perceived ease of use; then, an analysis of how users interacted with corpora to complete the task and what kind of queries they submitted. Finally, some general conclusions are drawn and areas for future work are outlined.


The thesis is divided into four parts. The first, written from a feminist perspective, gives an overview of femicide as a social phenomenon and of the related international legal situation. The second deals in general terms with the quality press, the medium chosen for the linguistic analysis. The third part presents a micro-corpus of Italian press coverage on the topic of femicide, and the fourth a micro-corpus of French press coverage of the "Affaire DSK", both accompanied by an analysis of the lexical and discursive component (Analyse du discours). It is a comparative study whose results made it possible to highlight and demonstrate how the Italian and French quality press tend to implicitly convey a sexist, stereotyped and discriminatory image of femicide and of the victim of violence.


Contacts between languages have always led to mutual influence. Today, the position of authority of the English language affects Italian in many ways, especially in the scientific and technical fields. When new studies conceived in the English-speaking world reach the Italian public, we are faced not only with the translation of texts, but most importantly with the rendition of theoretical constructs that do not always have a suitable rendering in the target language. That is why we often find anglicisms in Italian texts. This work aims to show their frequency in a specific field, underlining how and when they are used, and sometimes preferred to the corresponding Italian word. This dissertation looks at a sample of essays from the specialised magazine “Lavoro Sociale”, published by Edizioni Centro Studi Erickson, searching for borrowings from English and discussing their use in order to make hypotheses on the reasons for this phenomenon, against the wider background of translation studies and translation universals research. What I am most interested in is understanding the similarities and differences in the use of anglicisms by authors of Italian texts and translators from English into Italian, so that I can identify the main dynamics and tendencies. The paper has four parts. Chapter 1 briefly explains the theoretical background on translation studies, and introduces and discusses the notion of translation universals. After that, the research methodology and theoretical background on linguistic borrowings (especially anglicisms) in Italian are summarized. Chapter 2 presents the study, explaining the organisation of the material, the methodology used and the object of interest. Chapter 3 is the core of the dissertation because it contains the qualitative and quantitative data taken from the texts and the examination of the dynamics of the use of anglicisms.
Finally, Chapter 4 compares the conclusions drawn from the previous chapter with the opinions of authors, translators and proof-readers, whom I asked to answer a questionnaire written specifically to investigate the mechanisms and choices behind their use of anglicisms.


The aim of this dissertation is to analyze the language of evaluation in Italian, English and French sustainability reports in order to observe how firms build their corporate image and to investigate the kind of relationship they develop with their stakeholders. The analysis is carried out by applying Martin & White's Appraisal theory and corpus linguistics methods. For the purposes of this research, a multilingual specialized corpus of sustainability reports has been created, which is the result of two different levels of compilation. At the first level, three sub-corpora have been created with the aim of representing three different languages (Italian, English and French): at this level, the research on evaluative language will show that a standardization process of sustainability reports is underway. At the second level of compilation, each of the three sub-corpora has been split in two further sub-corpora, representative of two different business sectors: at this level, the research will show how the sector where firms operate directly influences the choice of the topics to be discussed. The first chapter of this dissertation introduces the concept of evaluative language, with a particular focus on the framework of Appraisal theory. The second chapter deals with corpus linguistics and describes different types of corpora, the search methods and the criteria for the compilation of corpora. The third chapter discusses the concepts of Corporate Social Responsibility and sustainability reports, focusing mainly on the reporting principles and the linguistic patterns of this genre, and provides an overview of the main guidelines and certifications for the reporting of sustainability actions. Chapter four is dedicated to the description of the methodology used for this research, while the last chapter presents and discusses the results of the analysis, in an attempt to draw generalizations on the use of evaluative language in this emerging genre.


The starting point of this terminological research was a traineeship at the Directorate-General for Translation (DGT) of the European Commission in Luxembourg. The traineeship project, namely updating and revising IATE entries in the financial domain, and the problems encountered while compiling those entries led to the definition of this thesis. The study sets out to analyse the reception of the terminology specific to the Basel 3 regulation by examining the phenomenon of linguistic variation in Italian and German corpora. The first chapter briefly describes the traineeship at the DGT, presents the IATE database and the terminology work carried out, and sets out the considerations that led to the development of the thesis project. The second chapter examines the domain under investigation in more depth, outlining the financial crisis that led to the drafting of the new Basel 3 rules, and presents the key points of the Basel 3 Accords. The third chapter offers an overview of the characteristics of economic and financial language and of the consequences of the new regulation from a linguistic point of view, highlighting the peculiarities of the terminology analysed. The fourth chapter describes the methodology followed and the resources used for the thesis project, namely ad hoc corpora in Italian and German for the analysis of the terms, and the related terminology entries. The fifth chapter concentrates on the phenomenon of linguistic variation, providing a theoretical framework of the different approaches to terminology, followed by the analysis of the corpora and a commentary on the results obtained; the theoretical reflections are then considered in the light of what emerged from the examination of the corpora.
Finally, the appendix contains the IATE terminology entries compiled during the traineeship and the terminology entries drawn up following the analysis carried out in this thesis.


The aim of this work is to illustrate the creation of two annotated and indexed bilingual Italian-English corpora of Giuseppe Verdi's opera librettos and to describe the potential of these resources. The project arose from the wish to investigate whether poetic texts can in practice be managed and consulted through corpora in translation-driven studies, choosing the opera libretto genre in particular because of its complexity, which stems in part from the fact that the textual content is strongly conditioned by the music. The first corpus, called LiVeGi, consists of five operas by Giuseppe Verdi and their English translations: Ernani, Il Trovatore, La Traviata, Aida and Falstaff; the second corpus, named FaLiVe, contains the Italian original of the opera Falstaff and two English translations produced about a century apart. The analysis of the libretto genre and of the main features of the five selected operas (Chapter 1) is followed by an overview of the practice of translating Verdi's works in the United Kingdom and the United States (Chapter 2) and by a presentation of the notions of Digital Humanities and computational linguistics, within which this study is situated (Chapter 3). The central section (Chapter 4) presents in detail all the practical stages in the creation of the two corpora, in particular selection and retrieval of the material, OCR, cleaning, annotation and standardisation of metacharacters, part-of-speech tagging, indexing and alignment, and ends with a description of the resulting resources. The work concludes (Chapter 5) by illustrating the potential of the two corpora and the research possibilities they offer, presenting two case studies by way of example: the language of the tragic heroines in Verdi's librettos in translation (a study carried out on the LiVeGi corpus) and the translation of insults in Falstaff (using the FaLiVe corpus).


This dissertation presents the translation of an excerpt from a novel by Terry Pratchett, The fifth elephant, and the analysis of this translation, carried out with the help of a corpus built ad hoc for this research. The corpus contains the Italian translations of eight different novels written by the same author, Terry Pratchett, and translated by three professional translators. The corpus was designed specifically to support the process of translating the excerpt and to exemplify a method for analysing the work of professional translators. This type of analysis, known as translational stylistics, aims to identify the stylistic differences between translators, looking specifically for those elements that make it possible to identify the work of one translator and distinguish it from that of another, regardless of the source text. The dissertation opens with a technical description of a corpus, explaining its uses in research and the methods for building one. It then gives an overview of the author and his works, together with information on the Italian translations of these novels and on the translators who produced them. The translation of the excerpt is then presented, followed by an analysis of the translation problems faced during the translation process and of how the corpus helped to resolve and overcome these difficulties. Finally, a case study on translational stylistics is presented, showing the stylistic differences between the work of different translators.


New technologies and digital processing have considerably facilitated corpus linguistics; the Internet, for example, is an easy and cheap tool for compiling corpora. The Internet is increasingly popular and increasingly important for communication because of the enormous influence of the new media, and it has affected life and society in many ways, in part fundamentally. It is therefore not surprising that language and communication themselves are affected. One of the most interesting phenomena within computer-mediated communication (CMC) is online social networks, which in a few years have become a widespread and continuously expanding means of communication. Their study is particularly interesting because, owing to the constant development of technology, online social networks are not a static entity but change incessantly, with new features for their use being introduced frequently. These new features are conditioned by the electronic medium, which in turn decisively influences the style of communication used on social networks such as Facebook and Tuenti. As a new medium of social interaction, online social networks produce a communication style of their own. The object of analysis of my thesis is how Facebook and Tuenti users from the city of Málaga create this style through the use of phonic features characteristic of the Andalusian variety, and how the users' linguistic attitudes influence the use of those phonic features. This study is based on a corpus compiled from utterances by informants on Facebook and Tuenti. A corpus consisting of broad transcriptions of recordings of speakers from Málaga serves as my comparison corpus.
Another methodological tool used to collect data will be the survey: one type of survey will be designed to capture the participants' attitudes towards various features of Andalusian/Málaga speech, and another to examine why people use these features on Facebook and Tuenti. This study builds on the results of a pilot study, which show that the social and linguistic factors analysed work differently in real and virtual speech. Because of these different uses, we can regard the electronic communication of Facebook and Tuenti as a style conditioned by the type of virtual space. It is a style that users employ to create social meaning and to express their identities through language.


The Internet is becoming an ever more popular and important means of communication, especially the so-called social networking sites. This study examines how the social networking sites Facebook and Tuenti influence communication. In a corpus of users from Málaga, the use of non-standard features was analysed and compared with their use in spoken language. This comparison suggests that the social and linguistic factors examined work differently in virtual and real language. Because of this difference in use, the electronic communication of Facebook and Tuenti can be regarded as a style that users employ to create social meaning and to express their linguistic identity.


Software corpora facilitate the reproducibility of analyses; however, static analysis of an entire corpus still requires considerable effort, often duplicated unnecessarily by multiple users. Moreover, most corpora are designed for a single language, which increases the effort needed for cross-language analysis. To address these aspects, we propose Pangea, an infrastructure allowing fast development of static analyses on multi-language corpora. Pangea uses language-independent meta-models stored as object model snapshots that can be directly loaded into memory and queried without any parsing overhead. To reduce the effort of performing static analyses, Pangea provides out-of-the-box support for creating and refining analyses in a dedicated environment, deploying an analysis on an entire corpus, using a runner that supports parallel execution, and exporting results in various formats. In this tool demonstration we introduce Pangea and provide several usage scenarios that illustrate how it reduces the cost of analysis.