870 results for parallel corpora


Relevance:

60.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

60.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

60.00%

Publisher:

Abstract:

Following the internationalization of contemporary higher education, academic institutions based in non-English-speaking countries are increasingly urged to produce content in English to address prospective international students and personnel, as well as to increase their attractiveness. The demand for English translations in the institutional academic domain is consequently increasing at a rate exceeding the capacity of the translation profession. Resources for assisting non-native authors and translators in the production of appropriate texts in L2 are therefore required to help academic institutions and professionals streamline their translation workload. Some of these resources include: (i) parallel corpora to train machine translation systems and multilingual authoring tools; and (ii) translation memories for computer-aided translation tools. The purpose of this study is to create and evaluate reference resources like the ones mentioned in (i) and (ii) through the automatic sentence alignment of a large set of Italian and English as a Lingua Franca (ELF) institutional academic texts given as equivalent but not necessarily parallel (i.e. translated). In this framework, a set of alignment algorithms and alignment tools is examined in order to identify the most effective one(s) in terms of accuracy and time- and cost-effectiveness. To determine the text pairs to align, a sample is selected according to document length similarity (in characters) and subsequently evaluated in terms of extent of noisiness/parallelism, alignment accuracy and content leverageability. The results of these analyses serve as the basis for the creation of an aligned bilingual corpus of academic course descriptions, which is ultimately used to create a translation memory in TMX format.
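
The two mechanical steps named in this abstract, filtering candidate document pairs by character-length similarity and exporting aligned segments as TMX, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the 0.8 threshold, the function names `length_ratio`, `select_pairs` and `to_tmx`, the header values, and the sample texts. The sentence alignment step itself is not shown.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this namespaced attribute as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def length_ratio(a, b):
    """Shorter/longer character-length ratio, in [0, 1]."""
    longer = max(len(a), len(b))
    if longer == 0:
        return 1.0
    return min(len(a), len(b)) / longer


def select_pairs(pairs, threshold=0.8):
    """Keep document pairs whose lengths are similar (threshold is hypothetical)."""
    return [(s, t) for s, t in pairs if length_ratio(s, t) >= threshold]


def to_tmx(aligned, src_lang="it", tgt_lang="en"):
    """Serialize aligned segment pairs as a small TMX 1.4 document."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-sketch",  # placeholder values
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in aligned:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Invented sample data: the second pair differs too much in length to be kept.
docs = [
    ("Corso di linguistica generale.", "General linguistics course."),
    ("Breve.", "A much longer English text that is not a translation at all."),
]
kept = select_pairs(docs)
tmx_text = to_tmx(kept)
```

A length ratio is only a cheap first filter: two texts of similar length can still be unrelated, which is why the abstract goes on to evaluate noisiness and alignment accuracy separately.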

Relevance:

60.00%

Publisher:

Abstract:

The aim of this dissertation is to compile a bilingual Italian-Russian culinary glossary that gathers the culinary terms used by Pellegrino Artusi in “La scienza in cucina e l'arte di mangiar bene” ("The Science of Cooking and the Art of Fine Dining") and proposes Russian equivalents, drawing in part on the partial Russian translation of the book. The dissertation consists of an introduction, four chapters, a conclusion, and a bibliography. The introduction gives an overview of the entire dissertation. The first chapter introduces Pellegrino Artusi with a brief account of his life and traces the vicissitudes and international success of his book, including its arrival in Russia. The second chapter is devoted to the terminological research and the preparatory stages of building the glossary: it describes the comparable and parallel corpora created ad hoc for the glossary, the resources used to build them, and the programs used to select the terminology. Because a partial Russian translation of the book (by I. Alekberova), provided by Casa Artusi, was available, the chapter also explains how the Italian terms were chosen and compared with those in the Russian translation. The third chapter presents the structure of the glossary, followed by the glossary itself. Each entry contains the term, its grammatical category, and its definition in both languages, followed in most cases by collocations, usage examples, synonyms, or additional notes. The fourth chapter comments on the compilation of the glossary: it analyzes the problems encountered, presents the solutions adopted with concrete examples, and discusses entries not included in the glossary. A brief conclusion follows, and the final part contains the bibliography, which lists all the resources used, and the webliography.
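
The entry structure described in this abstract (term, grammatical category, definition in both languages, then optional collocations, usage examples and synonyms) could be modelled roughly as follows. This is a sketch only: the class name `GlossaryEntry`, its field names, and the sample entry are hypothetical and do not come from the dissertation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GlossaryEntry:
    """One bilingual glossary entry; field names are illustrative assumptions."""
    term_it: str
    term_ru: str
    grammatical_category: str
    definition_it: str
    definition_ru: str
    # Optional parts: the abstract says they appear in most, not all, entries.
    collocations: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)


# Hypothetical entry with placeholder definitions, not taken from the glossary.
entry = GlossaryEntry(
    term_it="brodo",
    term_ru="бульон",
    grammatical_category="noun, masculine",
    definition_it="(definizione in italiano)",
    definition_ru="(определение на русском)",
    collocations=["brodo di carne"],
)
```

Giving the optional fields empty-list defaults keeps entries without collocations or synonyms valid.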

Relevance:

60.00%

Publisher:

Abstract:

Europarl is a large multilingual corpus containing the minutes of the debates at the European Parliament. This article presents a method to extract different corpora from Europarl: monolingual and multilingual comparable corpora, as well as parallel corpora. Using state-of-the-art measures of homogeneity, we show that these corpora are very similar. In addition, we argue that they present many advantages for research in various fields of linguistics and translation studies, and we also discuss some of their limitations. We conclude by reviewing a number of previous studies that made use of these corpora, emphasizing in each case the possibilities offered by Europarl.

Relevance:

60.00%

Publisher:

Abstract:

The various meanings of discourse connectives like while and however are difficult to identify and annotate, even for trained human annotators. This problem is all the more important because connectives are salient textual markers of cohesion and need to be correctly interpreted for many NLP applications. In this paper, we suggest an alternative route to a reliable annotation of connectives, making use of the information provided by their translations in large parallel corpora. This method thus replaces the difficult explicit reasoning involved in traditional sense annotation with an empirical clustering of the senses emerging from the translations. We argue that this method has the advantage of providing more reliable reference data than traditional sense annotation. In addition, its simplicity allows for the rapid construction of large annotated datasets.
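
The core idea, letting the translation of a connective stand in for an explicit sense label, can be sketched as a simple grouping step. This is an illustrative simplification under stated assumptions, not the paper's procedure: the function name `cluster_by_translation` is hypothetical, the English sentences and French renderings are invented toy data, and the connective's translation is assumed to have been identified already (for example through word alignment).

```python
from collections import defaultdict


def cluster_by_translation(occurrences):
    """Group source sentences by how the connective was translated.

    `occurrences` is a list of (source_sentence, translation_of_connective)
    pairs; each distinct translation is treated as one candidate sense cluster.
    """
    clusters = defaultdict(list)
    for source_sentence, translation in occurrences:
        clusters[translation.strip().lower()].append(source_sentence)
    return dict(clusters)


# Invented examples of English "while" with plausible French renderings.
occurrences = [
    ("She read while he cooked.", "pendant que"),
    ("He likes tea, while she prefers coffee.", "alors que"),
    ("While I agree in part, I have doubts.", "bien que"),
    ("They talked while they walked.", "Pendant que"),
]
clusters = cluster_by_translation(occurrences)
```

In practice one sense can surface as several translations, and one translation can cover several senses, so clusters like these would still need merging or manual review.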

Relevance:

60.00%

Publisher:

Abstract:

Conference interpreters must carry out documentation work before, during, and after the events at which they provide their services, regardless of their extralinguistic subcompetence. Unfortunately, few methodological proposals have been put forward to enable these professionals to perform this task systematically. In this article, we review some of the works that have addressed the means available to interpreters for meeting their information needs. Having noted this scarcity of proposals, we present, in a case study, a methodological approach to this documentation work, based on the compilation of ad hoc parallel corpora and terminology extraction in the form of glossaries.

Relevance:

30.00%

Publisher:

Abstract:

Pós-graduação em Estudos Linguísticos - IBILCE

Relevance:

30.00%

Publisher:

Abstract:

Software corpora facilitate reproducibility of analyses; however, static analysis of an entire corpus still requires considerable effort, often duplicated unnecessarily by multiple users. Moreover, most corpora are designed for a single language, increasing the effort required for cross-language analysis. To address these aspects, we propose Pangea, an infrastructure allowing fast development of static analyses on multi-language corpora. Pangea uses language-independent meta-models stored as object model snapshots that can be directly loaded into memory and queried without any parsing overhead. To reduce the effort of performing static analyses, Pangea provides out-of-the-box support for: creating and refining analyses in a dedicated environment, deploying an analysis on an entire corpus, using a runner that supports parallel execution, and exporting results in various formats. In this tool demonstration we introduce Pangea and provide several usage scenarios that illustrate how it reduces the cost of analysis.

Relevance:

20.00%

Publisher:

Abstract:

Prince Maximilian zu Wied's great exploration of coastal Brazil in 1815-1817 resulted in important collections of reptiles, amphibians, birds, and mammals, many of which were new species later described by Wied himself. The bulk of his collection was purchased for the American Museum of Natural History in 1869, although many "type specimens" had disappeared earlier. Wied carefully identified his localities but did not designate type specimens or type localities, which are taxonomic concepts that were not yet established. Information and manuscript names on a fraction (17 species) of his Brazilian reptiles and amphibians were transmitted by Wied to Prof. Heinrich Rudolf Schinz at the University of Zurich. Schinz included these species (credited to their discoverer "Princ. Max.") in the second volume of Das Thierreich ... (1822). Most are junior objective synonyms of names published by Wied. However, six of the 17 names used by Schinz predate Wied's own publications. Three were manuscript names never published by Wied because he determined the species to be previously known. (1) Lacerta vittata Schinz, 1822 (a nomen oblitum) = Lacerta striata sensu Wied (a misidentification, non Linnaeus nec sensu Merrem) = Kentropyx calcarata Spix, 1825, herein qualified as a nomen protectum. (2) Polychrus virescens Schinz, 1822 = Lacerta marmorata Linnaeus, 1758 (now Polychrus marmoratus). (3) Scincus cyanurus Schinz, 1822 (a nomen oblitum) = Gymnophthalmus quadrilineatus sensu Wied (a misidentification, non Linnaeus nec sensu Merrem) = Micrablepharus maximiliani (Reinhardt and Lütken, "1861" [1862]), herein qualified as a nomen protectum. Qualifying Scincus cyanurus Schinz, 1822, as a nomen oblitum also removes the problem of homonymy with the later-named Pacific skink Scincus cyanurus Lesson (= Emoia cyanura). The remaining three names used by Schinz are senior objective synonyms that take priority over Wied's names.
(4) Bufo cinctus Schinz, 1822, is senior to Bufo cinctus Wied, 1823; both, however, are junior synonyms of Bufo crucifer Wied, 1821 = Chaunus crucifer (Wied). (5) Agama picta Schinz, 1822, is senior to Agama picta Wied, 1823, requiring a change of authorship for this poorly known species, to be known as Enyalius pictus (Schinz). (6) Lacerta cyanomelas Schinz, 1822, predates Teius cyanomelas Wied, 1824 (1822-1831); both are nomina oblita. Wied's illustration and description show cyanomelas as apparently conspecific with the recently described but already well-known Cnemidophorus nativo Rocha et al., 1997, which is the valid name because of its qualification herein as a nomen protectum. The preceding specific name cyanomelas (as corrected in an errata section) is misspelled several ways in different copies of Schinz's original description ("cyanom las," "cyanomlas," and "cyanom"). Loosening, separation, and final loss of the last three letters of movable type in the printing chase probably account for the variant misspellings.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, artificial neural networks are employed in a novel approach to identify harmonic components of single-phase nonlinear load currents, whose amplitude and phase angle are subject to unpredictable changes, even in steady state. The first six harmonic current components are identified through the variation analysis of waveform characteristics. The effectiveness of this method is tested by applying it to the model of a single-phase active power filter, dedicated to the selective compensation of harmonic current drawn by an AC controller. Simulation and experimental results are presented to validate the proposed approach. © 2010 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

This paper discusses the integrated design of parallel manipulators, which exhibit varying dynamics. This characteristic affects the machine stability and performance. The design methodology consists of four main steps: (i) the system modeling using a flexible multibody technique, (ii) the synthesis of reduced-order models suitable for control design, (iii) the systematic flexible model-based input signal design, and (iv) the evaluation of some possible machine designs. The novelty in this methodology is to take structural flexibilities into consideration during the input signal design, thereby enhancing the standard design process, which mainly considers rigid-body dynamics. The potential of the proposed strategy is exploited for the design evaluation of a two-degree-of-freedom high-speed parallel manipulator. The results are experimentally validated. © 2010 Elsevier Ltd. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

The aim of this work is to verify the possibility of correlating specific gravity with wood hardness parallel and perpendicular to the grain. The purpose is to offer one more tool to help in choosing wood species for use in floors and sleepers. To this end, we considered the results of standard tests (NBR 7190:1997, Timber Structures Design, Annex B, Brazilian Association of Technical Standards) to determine hardness parallel and normal to the grain in fourteen tropical high-density wood species (over 850 kg/m³, at 12% moisture content). For each species, twelve determinations were made, based on material obtained in São Carlos and its regional wood market. Statistical analysis led to expressions describing the relationships between these properties, with a coefficient of determination of about 0.8.
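
The abstract does not state the form of the fitted expressions. As a sketch under the assumption of a simple straight-line relationship, the following computes an ordinary least-squares fit and its coefficient of determination. The function name `fit_line` and all sample values are hypothetical and are not measurements from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x, plus R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Invented values: density in kg/m^3 and hardness in arbitrary units.
density = [860.0, 900.0, 950.0, 1000.0, 1050.0, 1100.0]
hardness = [90.0, 98.0, 104.0, 118.0, 121.0, 135.0]
slope, intercept, r_squared = fit_line(density, hardness)
```

A coefficient of determination near 0.8, as reported in the abstract, would mean that roughly 80% of the variance in hardness is accounted for by the fitted expression.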

Relevance:

20.00%

Publisher:

Abstract:

This paper addresses the use of optimization techniques in the design of a steel riser. Two methods are used: the genetic algorithm, which imitates the process of natural selection, and simulated annealing, which is based on the annealing process of a metal. Both are capable of searching a given solution space for the best feasible riser configuration according to predefined criteria. Optimization issues are discussed, such as problem encoding, parameter selection, definition of the objective function, and constraints. A comparison between the results obtained for economic and structural objective functions is made for a case study. Parallelization of the optimization methods is also addressed. [DOI: 10.1115/1.4001955]
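
To make one of the two methods concrete, here is a minimal simulated annealing sketch on a toy one-dimensional objective. It is not the paper's riser formulation: the function names, the quadratic `toy_cost`, the geometric cooling schedule and every parameter value are assumptions chosen for illustration, and the genetic algorithm is not shown.

```python
import math
import random


def simulated_annealing(objective, lower, upper, steps=5000,
                        t0=10.0, alpha=0.995, seed=0):
    """Minimize `objective` on [lower, upper] with a basic annealing loop."""
    rng = random.Random(seed)
    x = rng.uniform(lower, upper)
    fx = objective(x)
    best_x, best_f = x, fx
    temperature = t0
    for _ in range(steps):
        # Propose a nearby candidate, clamped to the feasible interval.
        candidate = min(upper, max(lower, x + rng.uniform(-1.0, 1.0)))
        fc = objective(candidate)
        delta = fc - fx
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            x, fx = candidate, fc
            if fx < best_f:
                best_x, best_f = x, fx
        temperature *= alpha  # geometric cooling (an assumed schedule)
    return best_x, best_f


def toy_cost(x):
    """Stand-in objective with its minimum at x = 3."""
    return (x - 3.0) ** 2


best_x, best_f = simulated_annealing(toy_cost, 0.0, 10.0)
```

In a real design problem the candidate would be a vector of design variables, and constraints would be handled by rejecting infeasible candidates or by adding a penalty to the objective.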