868 results for traduzione automatica, machine translation, post-editing, pre-editing, workflow, LSP, TA, MT
Abstract:
In this paper we describe the methodology and structural design of a system that translates English into Malayalam using statistical models. A monolingual Malayalam corpus and a bilingual English/Malayalam corpus are the main resources used to build this statistical machine translator. The training strategy has been enhanced by PoS tagging, which helps to eliminate insignificant alignments. Moreover, incorporating units such as a suffix separator and a stop word eliminator has proven effective in improving training results. In the decoder, order conversion rules are applied to reduce the structural difference between the language pair. The quality of the decoder's statistical output is further improved by applying mending rules. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F measure, BLEU and WER evaluation metrics.
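Of the evaluation metrics named in the abstract above, WER (word error rate) is the simplest to illustrate. The following is a minimal sketch of the standard definition (word-level edit distance divided by the reference length), not the authors' evaluation code. The function name and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace; no normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

A lower value indicates a hypothesis closer to the reference; the value can exceed 1.0 when the hypothesis is much longer than the reference.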
Abstract:
This paper discusses a methodology for translating text from English into the Dravidian language Malayalam using statistical models. The translator uses a monolingual Malayalam corpus and a bilingual English/Malayalam corpus in the training phase and automatically generates the Malayalam translation of an unseen English sentence. Various techniques for improving the alignment model by incorporating morphological information into the bilingual corpus are discussed. Removing insignificant alignments from the sentence pairs in this way has ensured better training results. Pre-processing techniques such as suffix separation in the Malayalam corpus and stop word elimination in the bilingual corpus have also proved effective in producing better alignments. Difficulties in the translation process that arise from the structural difference between the English-Malayalam pair are resolved in the decoding phase by applying order conversion rules. The handcrafted rules designed for the suffix separation process, which can serve as a guideline for implementing suffix separation in Malayalam, are also presented in this paper. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F measure, BLEU and WER evaluation metrics.
Abstract:
In Statistical Machine Translation (SMT) from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam sentence using statistical models. An English-Malayalam parallel corpus is used in the training phase. Word-to-word alignments have to be set among the sentence pairs of the source and target languages before they are used for training. This paper deals with techniques that can be adopted to improve the alignment model of SMT. Incorporating part-of-speech information into the bilingual corpus has eliminated many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved advantageous when setting up the alignments. The presence of Malayalam words with predictable translations has likewise helped to reduce the insignificant alignments. Moreover, reducing the unwanted alignments has led to better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F measure, BLEU and WER evaluation metrics.
Abstract:
Complex networks have been increasingly used in text analysis, including in connection with natural language processing tools, as important text features appear to be captured by the topology and dynamics of the networks. Following previous works that apply complex network concepts to text quality measurement, summary evaluation, and author characterization, we now focus on machine translation (MT). In this paper we assess the possible representation of texts as complex networks to evaluate cross-linguistic issues inherent in manual and machine translation. We show that translations of different quality generated by MT tools can be distinguished from their manual counterparts by means of metrics such as in-degree (ID) and out-degree (OD), clustering coefficient (CC), and shortest paths (SP). For instance, we demonstrate that the average OD in networks of automatic translations consistently exceeds the values obtained for manual ones, and that the CC values of source texts are not preserved for manual translations, but are for good automatic translations. This probably reflects the text rearrangements humans perform during manual translation. We envisage that such findings could lead to better MT tools and automatic evaluation metrics.
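To make the network metrics in the abstract above concrete, the following sketch builds a word adjacency network from a text and computes the average out-degree and an average clustering coefficient. It illustrates the general idea only and is not the authors' pipeline. It assumes whitespace tokenization with no stop word removal or lemmatization, and it computes the clustering coefficient on the undirected version of the network. All function names are hypothetical.

```python
def build_word_network(text: str) -> dict[str, set[str]]:
    """Directed network with an edge u -> v for each pair of adjacent words.

    Assumption: lowercased whitespace tokens are the nodes; repeated
    adjacencies collapse into one edge and self-loops are skipped.
    """
    words = text.lower().split()
    successors: dict[str, set[str]] = {word: set() for word in words}
    for u, v in zip(words, words[1:]):
        if u != v:
            successors[u].add(v)
    return successors


def average_out_degree(successors: dict[str, set[str]]) -> float:
    """Mean number of distinct successor words per node."""
    if not successors:
        return 0.0
    return sum(len(targets) for targets in successors.values()) / len(successors)


def average_clustering(successors: dict[str, set[str]]) -> float:
    """Mean local clustering coefficient, ignoring edge direction."""
    neighbours: dict[str, set[str]] = {node: set() for node in successors}
    for u, targets in successors.items():
        for v in targets:
            neighbours[u].add(v)
            neighbours[v].add(u)
    if not neighbours:
        return 0.0
    total = 0.0
    for nbrs in neighbours.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        # Count each connected pair of neighbours once.
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in neighbours[a])
        total += 2 * links / (k * (k - 1))
    return total / len(neighbours)
```

Under these assumptions, a source text and its translations can each be turned into a network and compared on the two values.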
Abstract:
The aim of this study was to identify the classic autopsy signs of drowning in post-mortem multislice computed tomography (MSCT). Therefore, the post-mortem pre-autopsy MSCT findings of ten drowning cases were correlated with autopsy and statistically compared with the post-mortem MSCT of 20 non-drowning cases. Fluid in the airways was present in all drowning cases. Central aspiration in either the trachea or the main bronchi was usually observed. Consecutive bronchospasm caused emphysema aquosum. Sixty percent of drowning cases showed a mosaic pattern of the lung parenchyma due to regions of hypo- and hyperperfused lung areas of aspiration. The resorption of fresh water in the lung resulted in hypodensity of the blood representing haemodilution and possible heart failure. Swallowed water distended the stomach and duodenum, and inflow of water filled the paranasal sinuses (100%). All the typical findings of drowning, except Paltauf's spots, were detected using post-mortem MSCT, and a good correlation of MSCT and autopsy was found. The advantage of MSCT was the direct detection of bronchospasm, haemodilution and water in the paranasal sinuses, which is rather complicated or impossible at classical autopsy.
Abstract:
The purpose of this study was to evaluate whether pre-anaesthetic thoracic radiographs contribute to the anaesthetic management of trauma patients by comparing the American Society of Anesthesiologists Physical Status Classification (ASA grade) with and without information from thoracic radiography findings. Case records of 157 dogs and cats anaesthetized with or without post-traumatic, pre-anaesthetic chest radiographs were retrospectively evaluated for clinical parameters, radiographic abnormalities and anaesthetic protocol. Animals were retrospectively assigned an ASA grade. ASA grades, clinical signs of respiratory abnormalities and anaesthesia protocols were compared between animals with and without chest radiographs. The group of animals without pre-anaesthetic radiographs was anaesthetized earlier after trauma and showed fewer respiratory abnormalities at presentation. The retrospectively evaluated ASA grade increased significantly with the information from thoracic radiography. Animals with a higher ASA grade were less frequently mechanically ventilated. Pre-anaesthetic radiographs may provide important information for assessing the ASA grade in traumatized patients and may therefore influence the anaesthesia protocol.
Abstract:
This article traces the relationship between experience in the book trades (translation, writing, proofreading) and the new academic disciplines (translation studies, book history) to which transformations in material and financial supports have given rise.
Abstract:
The history of the translation of texts written by foreign travellers to Argentina during the 19th century has yet to be written. In this paper, we explore the possibilities of that project by analysing the work of four translators active in the first half of the 20th century: Carlos Aldao, Juan Heller, Carlos Muzzio Sáenz Peña and José Luis Busaniche. We also relate the work of these translators to the book collections that incorporated their translations into their catalogues. To conclude, we assess the fate of those translations in two more recent book collections: one published in the 1980s, and another, contemporary one, whose instalments began to appear in the late 1990s.
Abstract:
Ontologies and taxonomies are widely used to organize concepts, providing the basis for activities such as indexing and serving as background knowledge for NLP tasks. As such, translation of these resources would prove useful to adapt these systems to new languages. However, we show that the nature of these resources is significantly different from the "free-text" paradigm used to train most statistical machine translation (SMT) systems. In particular, these resources differ significantly in linguistic nature and carry rich additional semantics. We demonstrate that, as a result of these linguistic differences, standard SMT methods, in particular evaluation metrics, can produce poor performance. We then look to the task of leveraging these semantics for translation, which we approach in three ways: by adapting the translation system to the domain of the resource; by examining whether semantics can help to predict the syntactic structure used in translation; and by evaluating whether we can use existing translated taxonomies to disambiguate translations. We present some early results from these experiments, which shed light on the degree of success we may have with each approach.
Abstract:
The Semantic Web aims to allow machines to make inferences using the explicit conceptualisations contained in ontologies. By pointing to ontologies, Semantic Web-based applications are able to inter-operate and share common information easily. Nevertheless, multilingual semantic applications are still rare, because most online ontologies are monolingual in English. To solve this issue, techniques for ontology localisation and translation are needed. However, traditional machine translation is difficult to apply to ontologies, because ontology labels tend to be quite short and linguistically different from the free-text paradigm. In this paper, we propose an approach to enhance machine translation of ontologies based on exploiting the well-structured concept descriptions contained in the ontology. In particular, our approach leverages the semantics contained in the ontology by using Cross Lingual Explicit Semantic Analysis (CLESA) for context-based disambiguation in phrase-based Statistical Machine Translation (SMT). The presented work is novel in that, to the best of our knowledge, CLESA has not previously been applied in SMT.
Abstract:
This paper describes the design, development and field evaluation of a machine translation system from Spanish to Spanish Sign Language (LSE: Lengua de Signos Española). The system focuses on helping Deaf people when they want to renew their driver's license. It is made up of a speech recognizer (for decoding the spoken utterance into a word sequence), a natural language translator (for converting a word sequence into a sequence of signs belonging to the sign language), and a 3D avatar animation module (for playing back the signs). For the natural language translator, three technological approaches have been implemented and evaluated: an example-based strategy, a rule-based translation method and a statistical translator. In the final version, the language translator combines all the alternatives into a hierarchical structure. This paper includes a detailed description of the field evaluation, which was carried out in the Local Traffic Office in Toledo and involved real government employees and Deaf people. The evaluation includes objective measurements from the system and subjective information from questionnaires. The paper details the main problems found and discusses how to solve them (some of them are specific to LSE).
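One way to read "combines all the alternatives into a hierarchical structure" is as a fallback chain, where each translator is tried in turn and the first one that produces an output wins. The sketch below shows that reading only. The ordering (example-based, then rule-based, then statistical), the abstention criterion, the function names and the sign glosses are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Optional

# A translator maps a word sequence to a sign sequence, or None if it abstains.
Translator = Callable[[list[str]], Optional[list[str]]]

# Hypothetical toy resources; the glosses are placeholders, not real LSE data.
EXAMPLES = {("buenos", "dias"): ["GLOSS_GREETING"]}
RULE_LEXICON = {"renovar": "GLOSS_RENEW", "carnet": "GLOSS_LICENSE"}


def example_based(words: list[str]) -> Optional[list[str]]:
    """Answer only when the whole sentence matches a stored example."""
    return EXAMPLES.get(tuple(words))


def rule_based(words: list[str]) -> Optional[list[str]]:
    """Answer only when every word is covered by the rule lexicon."""
    if all(word in RULE_LEXICON for word in words):
        return [RULE_LEXICON[word] for word in words]
    return None


def statistical(words: list[str]) -> Optional[list[str]]:
    """Stand-in for a statistical translator: always produces some output."""
    return [RULE_LEXICON.get(word, "GLOSS_UNKNOWN") for word in words]


def translate(words: list[str], translators: list[Translator]) -> list[str]:
    """Return the output of the first translator that does not abstain."""
    for translator in translators:
        signs = translator(words)
        if signs is not None:
            return signs
    raise ValueError("no translator produced an output")


# Assumed ordering, from most specific to most general.
PIPELINE: list[Translator] = [example_based, rule_based, statistical]
```

Under this assumed ordering, `translate(["renovar", "hoy"], PIPELINE)` falls through to the statistical stand-in, because the first two translators abstain on the unknown word.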