984 results for Corpora as translation resources


Relevance: 30.00%

Abstract:

The aim of this dissertation is to provide a translation from English into Italian of a specialised scientific article published in the Cambridge Working Papers in Economics series. In this text, the authors estimate the economic consequences of the earthquake that hit the Abruzzo region in 2009. An extract of this translation will be published as part of conference proceedings. The main reason behind this choice is a personal interest in specialised translation in the economic domain. Moreover, the subject of the article is of particular interest to the Italian readership. The aim of this study is to show how a non-specialised translator can tackle such a highly specialised translation with the use of appropriate terminology resources and the collaboration of field experts. The translation could be of help to other Italian linguists looking for translated material in this particular domain, where English seems to be the dominant language. In order to ensure consistent terminology and adequate style, the document has been translated with the use of different resources, such as dictionaries, glossaries and specialised corpora. I also contacted field experts and the authors of the text. The collaboration with the authors proved to be an invaluable resource, yet one to be carefully managed. This work is divided into five chapters. The first deals with domain-specific sublanguages. The second gives an overview of corpus linguistics and describes the corpora designed for the translation. The third provides an analysis of the article, focusing on syntactical, lexical and structural features, while the fourth presents the translation, side-by-side with the source text. The fifth comments on the main difficulties encountered in the translation and the strategies used, as well as the relationship with the authors and their review of the published text. Appendix I contains the English-Italian econometric glossary.

Relevance: 30.00%

Abstract:

The aim of this dissertation is to provide a trilingual translation, from English into Italian and from Italian into Spanish, of a policy statement on road safety from the Fédération Internationale de l’Automobile (FIA). The document, named “Formula Zero: a strategy for reducing fatalities and injuries on track and road”, was published in June 2000 and sets out an approach to road safety inspired by ‘Vision Zero’, an approach introduced in Sweden. This work consists of six chapters. The first chapter introduces the main purposes and activities of the Federation, as well as the institutions related to it and Vision Zero. The second chapter presents the main lexical, morphosyntactic and stylistic features of institutional texts and special languages. In particular, the text contains technical transport terminology and elements of sports language, especially regarding motor sport and Formula One. In the third chapter, the methodology is explained, with all the resources used during the preliminary phase and the translation, including corpora, glossaries, expert consultancy and specialised sites. The fourth chapter focuses on the morphosyntactic and terminological features of the text, while the fifth chapter presents the source text and the target texts. The final chapter deals with the translation strategies applied, along with the challenging elements detected. Finally, the dissertation concludes with some theoretical and practical considerations about the role of inverse translation and English as a Lingua Franca (ELF), by comparing the text translated into Spanish to the original in English, using Italian as a lingua franca.

Relevance: 30.00%

Abstract:

The aim of this dissertation is to provide an adequate translation from English into Italian of a section of the European Commission's site concerning the Emissions Trading System, an environmental policy tool whose aim is to reduce EU greenhouse gas emissions. The main reason behind this choice was the intention to combine a personal interest in the domain of sustainable development with the desire to delve deeper into the different aspects involved in the localisation process. I also had the possibility to combine these two with my interest in the universe of the European Union. I therefore worked on the particular language of this supranational organisation, and for this reason I had the opportunity to experience a very stimulating work placement at the Directorate-General for Translation in Brussels. However, the choice of the text was personal and the translation is not intended for publication. The work is divided into six chapters. In the first chapter the text is contextualised within the framework of the EU and its legislation on multilingualism. This has consequences for the languages that are used by the drafters of the official documents and for the languages used by translators. The text originates from those documents, but it needs to be adapted to different receivers. The second chapter investigates the process of website localisation. The third chapter offers an analysis of the source text and of the prospective target text. In the fourth chapter the resources created and used for the translation of the text are described. A comparison is made between the resources of the translation service of the European Commission and the ones created specifically for this project: a translation memory, exploited through the use of a CAT tool, and two corpora. The fifth chapter contains the actual translation, side-by-side with the source text, while the sixth one provides a comment on the translation strategies.

Relevance: 30.00%

Abstract:

When particle flux is regulated by multiple factors such as particle supply and varying transport rate, it is important to identify the respective dominant regimes. We extend the well-studied totally asymmetric simple exclusion model to investigate the interplay between a controlled entrance and a local defect site. The model mimics cellular transport phenomena where there is typically a finite particle pool and nonuniform moving rates due to biochemical kinetics. Our simulations reveal regions where, despite an increasing particle supply, the current remains constant while particles redistribute in the system. Exploiting a domain wall approach with mean-field approximation, we provide a theoretical basis for our findings. The results for steady-state current and density profiles provide quantitative insights into the regulation of the transcription and translation process in bacterial protein synthesis.
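For readers unfamiliar with this class of model, the setup described in the abstract above can be sketched roughly as follows. This is only an illustrative toy, not the authors' simulation: the random-sequential update, the entry rule (entry rate proportional to how full the pool is), the recycling of exiting particles into the pool, and all parameter names and values are assumptions made for the sake of the example.

```python
import random

def simulate_tasep(length=50, total_particles=40, alpha=0.8, beta=0.8,
                   defect_site=25, defect_rate=0.3, sweeps=2000, seed=0):
    """Toy random-sequential TASEP with a finite pool and one slow site.

    Returns (lattice, pool, exits). Every name and default here is
    hypothetical; none of them comes from the abstract.
    """
    rng = random.Random(seed)
    lattice = [0] * length      # 1 = occupied site, 0 = empty site
    pool = total_particles      # particles waiting outside the lattice
    exits = 0                   # particles that have left the last site
    for _ in range(sweeps * (length + 1)):
        i = rng.randrange(-1, length)  # -1 stands for the entrance move
        if i == -1:
            # Assumed entry rule: rate scales with the pool's fill level.
            rate = alpha * pool / total_particles
            if lattice[0] == 0 and pool > 0 and rng.random() < rate:
                lattice[0] = 1
                pool -= 1
        elif i == length - 1:
            # Exit from the last site; the particle returns to the pool
            # (assumed recycling).
            if lattice[i] == 1 and rng.random() < beta:
                lattice[i] = 0
                pool += 1
                exits += 1
        elif lattice[i] == 1 and lattice[i + 1] == 0:
            # Bulk hop to the right; slower when leaving the defect site.
            hop = defect_rate if i == defect_site else 1.0
            if rng.random() < hop:
                lattice[i], lattice[i + 1] = 0, 1
    return lattice, pool, exits
```

Running the function for several values of `total_particles` and comparing `exits / sweeps` gives a crude estimate of how the current responds to particle supply in this toy version.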

Relevance: 30.00%

Abstract:

The various meanings of discourse connectives like while and however are difficult to identify and annotate, even for trained human annotators. This problem is all the more important because connectives are salient textual markers of cohesion and need to be correctly interpreted for many NLP applications. In this paper, we suggest an alternative route to reach a reliable annotation of connectives, by making use of the information provided by their translation in large parallel corpora. This method thus replaces the difficult explicit reasoning involved in traditional sense annotation with an empirical clustering of the senses emerging from the translations. We argue that this method has the advantage of providing more reliable reference data than traditional sense annotation. In addition, its simplicity allows for the rapid creation of large annotated datasets.
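The general idea of letting translations stand in for sense labels can be illustrated with a very small sketch. The abstract does not say how the clustering is done, so the example below simply groups occurrences by identical translation, which is the simplest possible stand-in; the sentences, the French renderings, and the function name are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical aligned examples:
# (English sentence, connective, French rendering of the connective).
aligned = [
    ("She read while he cooked.", "while", "pendant que"),
    ("While I agree, I have doubts.", "while", "bien que"),
    ("He slept while it rained.", "while", "pendant que"),
    ("While costly, the plan works.", "while", "bien que"),
    ("Some stayed, while others left.", "while", "tandis que"),
]

def cluster_by_translation(examples):
    """Group source sentences by (connective, translation) pair."""
    clusters = defaultdict(list)
    for sentence, connective, translation in examples:
        clusters[(connective, translation)].append(sentence)
    return dict(clusters)

clusters = cluster_by_translation(aligned)
for (connective, translation), sentences in clusters.items():
    print(connective, "->", translation, ":", len(sentences), "occurrence(s)")
```

In this toy data each translation happens to line up with a different reading of "while" (temporal, concessive, contrastive); real parallel data would be noisier and would need an alignment step first.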

Relevance: 30.00%

Abstract:

Discourse connectives are lexical items indicating coherence relations between discourse segments. Even though many languages possess a whole range of connectives, important divergences exist cross-linguistically in the number of connectives that are used to express a given relation. For this reason, connectives are not easily paired with a univocal translation equivalent across languages. This paper is a first attempt to design a reliable method to annotate the meaning of discourse connectives cross-linguistically using corpus data. We present the methodological choices made to reach this aim and report three annotation experiments using the framework of the Penn Discourse Treebank.

Relevance: 30.00%

Abstract:

This paper describes methods and results for the annotation of two discourse-level phenomena, connectives and pronouns, over a multilingual parallel corpus. Excerpts from Europarl in English and French, totalling about 3600 tokens, have been annotated with disambiguation information for connectives and pronouns. This data is then used in several ways: for cross-linguistic studies, for training automatic disambiguation software, and ultimately for training and testing discourse-aware statistical machine translation systems. The paper presents the annotation procedures and their results in detail, and gives an overview of the first systems trained on the annotated resources and their use for machine translation.

Relevance: 30.00%

Abstract:

Ontologies and taxonomies are widely used to organize concepts, providing the basis for activities such as indexing and serving as background knowledge for NLP tasks. As such, translation of these resources would prove useful to adapt these systems to new languages. However, we show that the nature of these resources is significantly different from the "free-text" paradigm used to train most statistical machine translation systems. In particular, we see significant differences in the linguistic nature of these resources, which also carry rich additional semantics. We demonstrate that as a result of these linguistic differences, standard SMT methods, in particular evaluation metrics, can produce poor performance. We then look to the task of leveraging these semantics for translation, which we approach in three ways: by adapting the translation system to the domain of the resource; by examining if semantics can help to predict the syntactic structure used in translation; and by evaluating if we can use existing translated taxonomies to disambiguate translations. We present some early results from these experiments, which shed light on the degree of success we may have with each approach.

Relevance: 30.00%

Abstract:

Language resources, such as multilingual lexica and multilingual electronic dictionaries, contain collections of lexical entries in several languages. Having access to the corresponding explicit or implicit translation relations between such entries might be of great interest for many NLP-based applications. By using Semantic Web-based techniques, translations can be made available on the Web to be consumed directly by other (semantically enabled) resources, without relying on application-specific formats. To that end, in this paper we propose a model for representing translations as linked data, as an extension of the lemon model. Our translation module represents some core information associated with term translations and does not commit to specific views or translation theories. As a proof of concept, we have extracted the translations of the terms contained in Terminesp, a multilingual terminological database, and represented them as linked data. We have made them accessible on the Web both for humans (via a Web interface) and software agents (with a SPARQL endpoint).

Relevance: 30.00%

Abstract:

Recently, experts and practitioners in language resources have started recognizing the benefits of the linked data (LD) paradigm for the representation and exploitation of linguistic data on the Web. The adoption of the LD principles is leading to an emerging ecosystem of multilingual open resources that make up the Linguistic Linked Open Data Cloud, in which datasets of linguistic data are interconnected and represented following common vocabularies, which facilitates linguistic information discovery, integration and access. In order to contribute to this initiative, this paper summarizes several key aspects of the representation of linguistic information as linked data from a practical perspective. The main goal of this document is to provide the basic ideas and tools for migrating language resources (lexicons, corpora, etc.) to LD on the Web and for carrying out some useful NLP tasks with them (e.g., word sense disambiguation). Such material was the basis of a tutorial given at the EKAW’14 conference, which is also reported in the paper.

Relevance: 30.00%

Abstract:

This paper discusses the impact of machine translation on the language industry, specifically addressing its effect on translators. It summarizes the history of the development of machine translation, explains the underlying theory that ties machine translation to its practical applications, and describes the different types of machine translation as well as other tools familiar to translators. There are arguments for and against its use, as well as evaluation methods for testing it. Internet and real-time communication are featured for their role in the increase of machine translation use. The potential that this technology has in the future of professional translation is examined. This paper shows that machine translation will continue to be increasingly used whether translators like it or not.

Relevance: 30.00%

Abstract:

The overall aim of the Araknion project is to provide Spanish and Catalan with a basic infrastructure of linguistic resources for the semantic processing of corpora, whether of oral or written origin, within the framework of Web 2.0.

Relevance: 30.00%

Abstract:

This paper presents the automatic extension to other languages of TERSEO, a knowledge-based system for the recognition and normalization of temporal expressions originally developed for Spanish. TERSEO was first extended to English through the automatic translation of the temporal expressions. Then, an improved porting process was applied to Italian, where the automatic translation of the temporal expressions from English and from Spanish was combined with the extraction of new expressions from an Italian annotated corpus. Experimental results demonstrate how, while still adhering to the rule-based paradigm, the development of automatic rule translation procedures allowed us to minimize the effort required for porting to new languages. Relying on such procedures, and without any manual effort or previous knowledge of the target language, TERSEO recognizes and normalizes temporal expressions in Italian with good results (72% precision and 83% recall for recognition).

Relevance: 30.00%

Abstract:

In this paper we present an automatic system for the extraction of syntactic-semantic patterns applied to the development of multilingual processing tools. In order to achieve optimum methods for the automatic treatment of more than one language, we propose the use of syntactic-semantic patterns. These patterns are formed by a verbal head and its main arguments, and they are aligned across languages. The system extracts and aligns syntactic-semantic patterns from two manually annotated corpora, and we evaluate the main linguistic problems that must be dealt with in the alignment process.

Relevance: 30.00%

Abstract:

There is no question nowadays as to the international and powerful status of English on a global scale and, consequently, as to its presence in non-English-speaking countries at different levels. Linguistically speaking, English is one of the languages that have most influenced Spanish throughout its history, especially since the late 1960s. In this study, the impact of English on Spanish is considered in the language of sports; particularly, sports Anglicisms and false Anglicisms are analysed. Due attention is paid to the different forms that an Anglicism may adopt and to which of those forms are more widely accepted or rejected by prescriptivists and speakers at large, in the light of a contrastive analysis of their appearance in the Nuevo diccionario de anglicismos, the Diccionario de la Real Academia Española and the Corpus de Referencia del Español Actual.