969 results for linguistic corpora


Relevance:

30.00%

Abstract:

This study aims at discussing aspects related to learner corpora and linguistic features found in texts written by English learners based on the use of collocations in text production. For this research, we analyzed collocations with the verb “to have” and with the nouns “prejudice” and “regret”.

Relevance:

30.00%

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

30.00%

Abstract:

This article presents an investigation of four linguistic phenomena of Portuguese at different synchronic stages. The research concerns the construction, description and analysis of corpora for the creation of databases that support linguistic description.

Relevance:

30.00%

Abstract:

The starting point of this article is the question "How to retrieve fingerprints of rhythm in written texts?" We address this problem in the case of Brazilian and European Portuguese. These two dialects of Modern Portuguese share the same lexicon, and most of the sentences they produce are superficially identical. Yet they are conjectured, on linguistic grounds, to implement different rhythms. We show that this linguistic question can be formulated as a problem of model selection in the class of variable-length Markov chains. To carry out this approach, we compare texts from European and Brazilian Portuguese. The texts are first encoded according to some basic rhythmic features of the sentences, which can be retrieved automatically. This is an entirely new approach from the linguistic point of view. Our statistical contribution is the introduction of the smallest maximizer criterion, which is a constant-free procedure for model selection. As a by-product, this provides a solution to the problem of the optimal choice of the penalty constant when using the BIC to select a variable-length Markov chain. Besides proving the consistency of the smallest maximizer criterion when the sample size diverges, we also present a simulation study comparing our approach with both the standard BIC selection and the Peres-Shields order estimation. Applied to the linguistic sample constituted for our case study, the smallest maximizer criterion assigns different context-tree models to the two dialects of Portuguese. The features of the selected models are compatible with current conjectures discussed in the linguistic literature.
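To make the role of the penalty constant concrete, the sketch below selects the order of a plain fixed-order Markov chain with a BIC-style score. It is a deliberately simplified stand-in: it does not implement variable-length Markov chains, context trees, or the smallest maximizer criterion described in the abstract. The function names (`log_likelihood`, `bic_order`, `simulate_chain`) and the default `penalty=0.5` are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def log_likelihood(seq, order, start):
    """Maximised log-likelihood of seq[start:] under a Markov chain of the given order."""
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = tuple(seq[i - order:i])  # empty tuple when order == 0
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1
    return sum(
        n * math.log(n / context_counts[context])
        for (context, _), n in transition_counts.items()
    )


def bic_order(seq, max_order, penalty=0.5):
    """Return the order in 0..max_order with the highest penalised log-likelihood.

    `penalty` is the constant multiplying (free parameters) * log(n); the result
    can change with this constant, which is the sensitivity the abstract refers to.
    """
    alphabet_size = len(set(seq))
    n = len(seq) - max_order  # every order is scored on the same positions
    best_order, best_score = 0, -math.inf
    for order in range(max_order + 1):
        free_params = (alphabet_size ** order) * (alphabet_size - 1)
        score = log_likelihood(seq, order, max_order) - penalty * free_params * math.log(n)
        if score > best_score:
            best_order, best_score = order, score
    return best_order


def simulate_chain(length, stay_prob, seed):
    """Binary first-order chain that repeats the previous symbol with probability stay_prob."""
    rng = random.Random(seed)
    seq = [0]
    for _ in range(length - 1):
        seq.append(seq[-1] if rng.random() < stay_prob else 1 - seq[-1])
    return seq


if __name__ == "__main__":
    sample = simulate_chain(5000, 0.9, seed=0)
    print("selected order:", bic_order(sample, max_order=3))
```

Rerunning `bic_order` with different `penalty` values on the same sample shows how the selected order can depend on that constant, which is the issue a constant-free procedure is meant to avoid.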

Relevance:

30.00%

Abstract:

The construction and use of multimedia corpora have been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of the field of AVT, similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation, developed at the University of Bologna’s Department of Interdisciplinary Studies in Translation, Languages and Culture. In order to quantify strategies of linguistic transfer of an AV product, we need to take into consideration not only the linguistic aspect of such a product but all the meaning-making resources deployed in the filmic text. Since one major benefit of Forlixt 1 is the combination of audiovisual and textual data, this corpus allows the user to access primary data for scientific investigation, and thus no longer rely on pre-processed material such as traditional annotated transcriptions. Based on this rationale, the first chapter of the thesis sets out to illustrate the state of the art of research in the disciplinary fields involved. The primary objective was to underline the main repercussions on multimedia texts resulting from the interaction of a double support, audio and video, and, accordingly, on procedures, means, and methods adopted in their translation. By drawing on previous research in semiotics and film studies, the relevant codes at work in visual and acoustic channels were outlined. Subsequently, we concentrated on the analysis of the verbal component and on the peculiar characteristics of filmic orality as opposed to spontaneous dialogic production. In the second part, an overview of the main AVT modalities was presented (dubbing, voice-over, interlinguistic and intra-linguistic subtitling, audio-description, etc.) in order to define the different technologies, processes and professional qualifications that this umbrella term presently includes. The second chapter focuses diachronically on the contribution of various theories to the application of Corpus Linguistics methods and tools to the field of Translation Studies (i.e. Descriptive Translation Studies, Polysystem Theory). In particular, we discussed how the use of corpora can help reduce the gap existing between qualitative and quantitative approaches. Subsequently, we reviewed the tools traditionally employed by Corpus Linguistics in the construction of traditional “written language” corpora, to assess whether and how they can be adapted to meet the needs of multimedia corpora. In particular, we reviewed existing speech and spoken corpora, as well as multimedia corpora specifically designed to investigate translation. The third chapter reviews the main development steps of Forlixt 1, from a technical (IT design principles, data query functions) and methodological point of view, by laying down extensive scientific foundations for the annotation methods adopted, which presently encompass categories of a pragmatic, sociolinguistic, linguacultural and semiotic nature. Finally, we described the main query tools (free search, guided search, advanced search and combined search) and the main intended uses of the database from a pedagogical perspective. The fourth chapter lists the specific compilation criteria retained, as well as statistics on the two sub-corpora, by presenting data broken down by language pair (French-Italian and German-Italian) and genre (film comedies, television soap operas and crime series). Next, we concentrated on the discussion of the results obtained from the analysis of summary tables reporting the frequency of categories applied to the French-Italian sub-corpus.
The detailed observation of the distribution of categories identified in the original and dubbed corpus allowed us to empirically confirm some of the theories put forward in the literature and notably concerning the nature of the filmic text, the dubbing process and Italian dubbed language’s features. This was possible by looking into some of the most problematic aspects, like the rendering of socio-linguistic variation. The corpus equally allowed us to consider so far neglected aspects, such as pragmatic, prosodic, kinetic, facial, and semiotic elements, and their combination. At the end of this first exploration, some specific observations concerning possible macrotranslation trends were made for each type of sub-genre considered (cinematic and TV genre). On the grounds of this first quantitative investigation, the fifth chapter intended to further examine data, by applying ad hoc models of analysis. Given the virtually infinite number of combinations of categories adopted, and of the latter with searchable textual units, three possible qualitative and quantitative methods were designed, each of which was to concentrate on a particular translation dimension of the filmic text. The first one was the cultural dimension, which specifically focused on the rendering of selected cultural references and on the investigation of recurrent translation choices and strategies justified on the basis of the occurrence of specific clusters of categories. The second analysis was conducted on the linguistic dimension by exploring the occurrence of phrasal verbs in the Italian dubbed corpus and by ascertaining the influence on the adoption of related translation strategies of possible semiotic traits, such as gestures and facial expressions. Finally, the main aim of the third study was to verify whether, under which circumstances, and through which modality, graphic and iconic elements were translated into Italian from an original corpus of both German and French films. 
After having reviewed the main translation techniques at work, an exhaustive account of possible causes for their non-translation was also provided. By way of conclusion, the discussion of results obtained from the distribution of annotation categories on the French-Italian corpus, as well as the application of specific models of analysis, allowed us to underline possible advantages and drawbacks related to the adoption of a corpus-based approach to AVT studies. Even though possible updates and improvements were proposed in order to help solve some of the problems identified, it is argued that the added value of Forlixt 1 lies ultimately in having created a valuable instrument that makes it possible to carry out empirically sound contrastive studies that may be usefully replicated on different language pairs and several types of multimedia texts. Furthermore, multimedia corpora can also play a crucial role in L2 and translation teaching, two disciplines in which their use still lacks systematic investigation.

Relevance:

30.00%

Abstract:

Contacts between languages have always led to mutual influence. Today, the position of authority of the English language affects Italian in many ways, especially in the scientific and technical fields. When new studies conceived in the English-speaking world reach the Italian public, we are faced not only with the translation of texts, but most importantly with the rendition of theoretical constructs that do not always have a suitable rendering in the target language. That is why we often find anglicisms in Italian texts. This work aims to show their frequency in a specific field, underlining how and when they are used, and sometimes preferred to the corresponding Italian word. This dissertation looks at a sample of essays from the specialised magazine “Lavoro Sociale”, published by Edizioni Centro Studi Erickson, searching for borrowings from English and discussing their use in order to make hypotheses on the reasons for this phenomenon, against the wider background of translation studies and translation universals research. What I am most interested in is understanding the similarities and differences in the use of anglicisms by authors of Italian texts and translators from English into Italian, so that I can identify the main dynamics and tendencies. The paper has four parts. Chapter 1 briefly explains the theoretical background on translation studies, and introduces and discusses the notion of translation universals. After that, the research methodology and theoretical background on linguistic borrowings (especially anglicisms) in Italian are summarized. Chapter 2 presents the study, explaining the organisation of the material, the methodology used and the object of interest. Chapter 3 is the core of the dissertation because it contains the qualitative and quantitative data taken from the texts and the examination of the dynamics of the use of anglicisms.
Finally, Chapter 4 compares the conclusions drawn from the previous chapter with the opinions of authors, translators and proof-readers, whom I asked to answer a questionnaire written specifically to investigate the mechanisms and choices behind their use of anglicisms.

Relevance:

30.00%

Abstract:

The aim of this dissertation is to analyze the language of evaluation in Italian, English and French sustainability reports in order to observe how firms build their corporate image and to investigate the kind of relationship they develop with their stakeholders. The analysis is carried out by applying Martin & White's Appraisal theory and corpus linguistics methods. For the purposes of this research, a multilingual specialized corpus of sustainability reports has been created, which is the result of two different levels of compilation. At the first level, three sub-corpora have been created with the aim of representing three different languages (Italian, English and French): at this level, the research on evaluative language will show that a standardization process of sustainability reports is underway. At the second level of compilation, each of the three sub-corpora has been split in two further sub-corpora, representative of two different business sectors: at this level, the research will show how the sector where firms operate directly influences the choice of the topics to be discussed. The first chapter of this dissertation introduces the concept of evaluative language, with a particular focus on the framework of Appraisal theory. The second chapter deals with corpus linguistics and describes different types of corpora, the search methods and the criteria for the compilation of corpora. The third chapter discusses the concepts of Corporate Social Responsibility and sustainability reports, focusing mainly on the reporting principles and the linguistic patterns of this genre, and provides an overview of the main guidelines and certifications for the reporting of sustainability actions. Chapter four is dedicated to the description of the methodology used for this research, while the last chapter presents and discusses the results of the analysis, in an attempt to draw generalizations on the use of evaluative language in this emerging genre.

Relevance:

30.00%

Abstract:

Europarl is a large multilingual corpus containing the minutes of the debates at the European Parliament. This article presents a method to extract different corpora from Europarl: monolingual and multilingual comparable corpora, as well as parallel corpora. Using state-of-the-art measures of homogeneity, we show that these corpora are very similar. In addition, we argue that they present many advantages for research in various fields of linguistics and translation studies, and we also discuss some of their limitations. We conclude by reviewing a number of previous studies that made use of these corpora, emphasizing in each case the possibilities offered by Europarl.

Relevance:

30.00%

Abstract:

Discourse connectives are lexical items indicating coherence relations between discourse segments. Even though many languages possess a whole range of connectives, important divergences exist cross-linguistically in the number of connectives that are used to express a given relation. For this reason, connectives are not easily paired with a univocal translation equivalent across languages. This paper is a first attempt to design a reliable method to annotate the meaning of discourse connectives cross-linguistically using corpus data. We present the methodological choices made to reach this aim and report three annotation experiments using the framework of the Penn Discourse Tree Bank.

Relevance:

30.00%

Abstract:

The general goal of the Araknion project is to provide Spanish and Catalan with a basic infrastructure of linguistic resources for the semantic processing of corpora, whether of oral or written origin, within the framework of Web 2.0.

Relevance:

30.00%

Abstract:

Telenovela’s orality: from medium to a linguistic-discursive construction. Studies about telenovelas usually highlight their "orality". However, a literature review, specifically for Latin American telenovelas, shows that the term "orality" has been used with varying senses. In contrast with studies devoted to telenovelas, literary studies have addressed the question by conceptualizing it as fictional orality. This paper takes fictional orality as a key concept to explain the discursive peculiarities of telenovelas and, on that basis, distinguishes several dimensions of linguistic and discursive variation along which such orality is portrayed.

Relevance:

30.00%

Abstract:

This study analyses the text genre of the real estate sale contract in the legal systems of Italy, Germany and Austria from a synchronic and pragmatic perspective. The text is regarded as a communicative act bound to pre-established conventions and aimed at fulfilling specific social functions. The main objective of the work is to develop a model of textual analysis that can highlight the interaction between the primary function and the macro- and microstructural organisation of this text genre, that is, between the legal level and the linguistic-textual level. The analysis also makes it possible to compare three legal systems with regard to how this transaction is carried out, as well as the Italian and German languages and, in addition, two varieties of the latter. The corpus consists of 40 authentic deeds and 9 deeds from formularies, spanning a period from 2000 to 2018. The analysis begins by defining the intratextual and extratextual coordinates that determine this text genre and by classifying it within the texts of the legal domain. On this basis, the contracts in the Italian, German and Austrian corpora are analysed separately with regard to their macrostructure, which comprises three macrostructural levels: the legal level on the one hand, and the functional and thematic levels on the other. The interaction between the legal function and the linguistic-textual organisation of the real estate sale contract emerges in particular at the functional level, that is, in the sequence of linguistic functions realised on the basis of the legal content. Finally, the results obtained from the analysis of the three corpora are compared and supplemented with a classification of the verb forms that characterise particular macro-areas of use/function within this text genre, that is, the realisation of specific linguistic and legal functions. The proposed method offers new starting points for future research, both in contrastive linguistics applied to specialised texts and in translation and legal linguistics.

Relevance:

30.00%

Abstract:

In political debates, the media[tisation] can determine the use of language with the aim of increasing their spectacularisation and polarisation, possibly by means of criticism and humour, respectively. These linguistic strategies are often used in order to shape what was defined by Goffman as one’s face. Politicians, in particular, can resort to facework in a double sense: shaping their own face positively and/or that of their opponents negatively. Starting from the sociological theory of face by Goffman and Levinson, and with the help of corpus analysis tools, this research investigated the ways in which various forms of criticism and humour were conducted in 3 electoral debates on a national scale (Germany, Ireland, and New Zealand) and 1 debate for the municipal election in Rome. The transcripts were obtained by revising automatic transcriptions that were either extracted or found online; the corresponding audio-visual content is available on the Internet. The CADS research aimed to investigate the role that criticism and humour played within each participant’s discourse, and to identify differences and similarities among the strategies used by political leaders and moderators in different countries, and in different cultural, political, and media contexts.

Relevance:

20.00%

Abstract:

The present paper proposes a flexible consensus scheme for group decision making, which allows one to obtain a consistent collective opinion from information provided by each expert in terms of multigranular fuzzy estimates. It is based on a linguistic hierarchical model with multigranular sets of linguistic terms, and the choice of the most suitable set is a prerogative of each expert. From the human viewpoint, using such a model is advantageous, since it permits each expert to utilize linguistic terms that reflect more adequately the level of uncertainty intrinsic to his evaluation. From the operational viewpoint, the advantage of using such a model lies in the fact that it allows one to express the linguistic information in a unique domain, without loss of information, during the discussion process. The proposed consensus scheme supposes that the moderator can intervene in the discussion process in different ways. The intervention can be a request to any expert to update his opinion or an adjustment of the weight of each expert's opinion. An optimal adjustment can be achieved through the execution of an optimization procedure that searches for the weights that maximize a corresponding soft consensus index. In order to demonstrate the usefulness of the presented consensus scheme, a technique for multicriteria analysis, based on fuzzy preference relation modeling, is utilized for solving a hypothetical enterprise strategy planning problem, generated with the use of the Balanced Scorecard methodology. (C) 2009 Elsevier Inc. All rights reserved.

Relevance:

20.00%

Abstract:

Traditionally the basal ganglia have been implicated in motor behavior, as they are involved in both the execution of automatic actions and the modification of ongoing actions in novel contexts. Corresponding to cognition, the role of the basal ganglia has not been defined as explicitly. Relative to linguistic processes, contemporary theories of subcortical participation in language have endorsed a role for the globus pallidus internus (GPi) in the control of lexical-semantic operations. However, attempts to empirically validate these postulates have been largely limited to neuropsychological investigations of verbal fluency abilities subsequent to pallidotomy. We evaluated the impact of bilateral posteroventral pallidotomy (BPVP) on language function across a range of general and high-level linguistic abilities, and validated/extended working theories of pallidal participation in language. Comprehensive linguistic profiles were compiled up to 1 month before and 3 months after BPVP in 6 subjects with Parkinson's disease (PD). Commensurate linguistic profiles were also gathered over a 3-month period for a nonsurgical control cohort of 16 subjects with PD and a group of 16 non-neurologically impaired controls (NC). Nonparametric between-groups comparisons were conducted and reliable change indices calculated, relative to baseline/3-month follow-up difference scores. Group-wise statistical comparisons between the three groups failed to reveal significant postoperative changes in language performance. Case-by-case data analysis relative to clinically consequential change indices revealed reliable alterations in performance across several language variables as a consequence of BPVP. These findings lend support to models of subcortical participation in language, which promote a role for the GPi in lexical-semantic manipulation mechanisms. 
Concomitant improvements and decrements in postoperative performance were interpreted within the context of additive and subtractive postlesional effects. Relative to parkinsonian cohorts, clinically reliable versus statistically significant changes on a case by case basis may provide the most accurate method of characterizing the way in which pathophysiologically divergent basal ganglia linguistic circuits respond to BPVP.
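For readers unfamiliar with reliable change indices, the sketch below shows one widely used formulation (Jacobson and Truax), in which a pre/post difference is divided by the standard error of the difference. The abstract does not state which formulation or thresholds the authors used, so the formula choice, the 1.96 cutoff, the function names and the numbers in the example are all assumptions for illustration, not the study's data or method.

```python
import math


def reliable_change_index(baseline, follow_up, sd, reliability):
    """Jacobson-Truax style RCI for one subject on one test.

    sd          -- standard deviation of the test in a reference sample
    reliability -- test-retest reliability coefficient (between 0 and 1)
    """
    sem = sd * math.sqrt(1.0 - reliability)  # standard error of measurement
    se_diff = math.sqrt(2.0) * sem           # standard error of the difference score
    return (follow_up - baseline) / se_diff


def is_reliable_change(rci, cutoff=1.96):
    """Flag a change as reliable when |RCI| exceeds the cutoff (1.96 is a common, assumed choice)."""
    return abs(rci) > cutoff


if __name__ == "__main__":
    # Hypothetical scores, not taken from the study.
    rci = reliable_change_index(baseline=40, follow_up=55, sd=10, reliability=0.8)
    print(round(rci, 2), is_reliable_change(rci))
```

Because the index is computed per subject, it can flag individual improvements or decrements even when a group-level comparison shows no significant change, which is the contrast the abstract draws.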