9 results for Machine translation system
at Universidad de Alicante
Abstract:
Statistical machine translation (SMT) is an approach to machine translation (MT) that uses statistical models whose parameters are estimated from the analysis of existing human translations (contained in bilingual corpora). From a translation student's standpoint, this dissertation aims to explain how a phrase-based SMT system works, to determine the role that its statistical models play in the translation process, and to assess the quality of the translations produced when the system is trained with good-quality in-domain corpora. To that end, a phrase-based SMT system based on Moses has been trained and subsequently used for the English-to-Spanish translation of two texts related in topic to the training data. Finally, the quality of the output texts produced by the system has been assessed through a quantitative evaluation carried out with three different automatic evaluation measures and a qualitative evaluation based on the Multidimensional Quality Metrics (MQM).
Abstract:
In the last few years, research on textual information systems has developed considerably. The goal is to improve these systems so that information stored in digital format (digital databases, document databases, and so on) can be easily located, processed and accessed. Many applications focus on information access (for example, Web search systems such as Google or Altavista). However, these applications run into problems when they must access cross-language information, or when they need to show information in a language different from that of the query. This paper explores the use of syntactic-semantic patterns as a method for accessing multilingual information and reviews, for the case of Information Retrieval, where it is possible and useful to employ patterns in the multilingual and interactive aspects. On the one hand, the multilingual aspects studied are those related to accessing documents in languages different from that of the query, as well as the automatic translation of the document, i.e. a machine translation system based on patterns. On the other hand, the paper examines in depth the interactive aspects related to reformulating a query on the basis of the syntactic-semantic pattern of the request.
Abstract:
Paper presented at the Cross-Language Evaluation Forum (CLEF 2008), Aarhus, Denmark, September 17-19, 2008.
Abstract:
This paper addresses the recognition of temporal expressions and the resolution of their temporal reference. We present the units we have used to tackle these tasks in a restricted domain. We work with newspaper articles in Spanish, which is why every reference we use is in Spanish. To identify and recognize temporal expressions we rely on a temporal expression grammar, and to resolve them we rely on a dictionary that holds the information needed to perform date operations on the recognized expressions. In the evaluation of our proposal we obtained successful results for the examples studied.
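As a rough illustration of the grammar-plus-dictionary idea described in this abstract, the sketch below recognizes a few Spanish deictic expressions with a regular expression and resolves them against a reference date through a lookup table. It is not the authors' system: the `OFFSETS` dictionary, the `PATTERN` expression and the `resolve` function are hypothetical names chosen for this example, and the coverage is deliberately tiny.

```python
import re
from datetime import date, timedelta

# Hypothetical dictionary: temporal expression -> offset in days from the
# document date. A real system would cover far more expressions.
OFFSETS = {
    "anteayer": -2,
    "ayer": -1,
    "hoy": 0,
    "mañana": 1,          # ambiguous in Spanish ("tomorrow" / "morning"); ignored here
    "pasado mañana": 2,
}

# Toy "grammar": one alternation, longest expressions first so that
# "pasado mañana" is preferred over plain "mañana".
PATTERN = re.compile(
    r"\b(" + "|".join(sorted(map(re.escape, OFFSETS), key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)

def resolve(text: str, reference: date) -> list[tuple[str, date]]:
    """Return (expression, resolved date) pairs found in text."""
    results = []
    for match in PATTERN.finditer(text):
        expression = match.group(1)
        offset = OFFSETS[expression.lower()]
        results.append((expression, reference + timedelta(days=offset)))
    return results

if __name__ == "__main__":
    sentence = "El ministro llegó ayer y se marcha pasado mañana."
    for expression, resolved in resolve(sentence, date(2008, 9, 17)):
        print(expression, "->", resolved.isoformat())
```

Keeping recognition (the pattern) separate from resolution (the dictionary) mirrors the division described in the abstract: new expressions can be added to the table without touching the matching logic.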
Abstract:
The field of natural language processing (NLP) has grown considerably in recent years; its research areas include information retrieval and extraction, data mining, machine translation, question answering systems, automatic summarization and sentiment analysis, among others. This article presents concepts and some tools intended to help readers understand text processing with NLP techniques, with the aim of extracting relevant information that can be used in a wide range of applications. Automatic classifiers can be developed to categorize documents and recommend tags; these classifiers must be platform-independent, easy to customize so that they can be integrated into different projects, and able to learn from examples. The article introduces these classification algorithms, analyzes some currently available open-source tools for carrying out these tasks, and compares several implementations using the F-measure to evaluate the classifiers.
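The F-measure mentioned at the end of this abstract combines precision and recall into a single score. The sketch below computes it from raw counts of true positives, false positives and false negatives; the function name `f_measure` and the `beta` parameter are choices made for this illustration, and the abstract does not say which variant of the measure the article uses.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from classification counts.

    tp, fp, fn: true positives, false positives, false negatives.
    beta: weight of recall relative to precision (1.0 gives the usual F1).
    Returns 0.0 when precision and recall are both zero or undefined.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

if __name__ == "__main__":
    # A classifier that labels 10 documents as positive, 8 of them correctly,
    # and misses 2 positive documents.
    print(f_measure(tp=8, fp=2, fn=2))
```

Because the score is a harmonic mean, a classifier cannot reach a high value by maximizing precision or recall alone, which makes it a convenient single number for comparing implementations.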
imaxin|software: NLP applied to improving the multilingual communication of companies and institutions
Abstract:
imaxin|software is a company founded in 1997 by four computer engineering graduates with the aim of developing educational multimedia video games and multilingual natural language processing. Seventeen years later, we have developed reference multilingual resources, tools and applications for different languages: Portuguese (Galicia, Portugal, Brazil, etc.), Spanish (Spain, Argentina, Mexico, etc.), English, Catalan and French. In this article we describe the main milestones in bringing these NLP technologies to the industrial and institutional sectors.
Abstract:
Contains: Jussawalla, Feroza (2003): Chiffon Saris. Toronto: TSAR Publications, 92 pages / Reviewed by Silvia Caporale Bizzini; Fernández Álvarez, M. Pilar and Antón Teodoro Manrique (2002): Antología de la literatura nórdica antigua. Salamanca: Ediciones Universidad / Reviewed by José R. Belda; Schwarlz, Anja (2001): The (im)possibilities of machine translation. Frankfurt am Main: Peter Lang, 323 pages / Reviewed by Silvia Borrás Giner; Terttu Nevalainen and Helena Raumolin-Brunberg (2003): Historical Sociolinguistics: Language Change in Tudor and Stuart England. Great Britain: Pearson Education, 260 pages / Reviewed by Sara Ponce Serrano.
Abstract:
Extension to new languages is a well-known bottleneck for rule-based systems. Transferring the knowledge available to the system from one language to a new one requires considerable human effort, which typically consists of rewriting large numbers of rules from scratch. Provided sufficient annotated data are available, machine learning algorithms make it possible to minimize the cost of such knowledge transfer but, to date, they have proved ineffective for some specific tasks. Among these, the recognition and normalization of temporal expressions still remains out of their reach. Focusing on this task, and still adhering to the rule-based framework, this paper presents a set of experiments on the automatic porting to Italian of a system originally developed for Spanish. Different automatic rule translation strategies are evaluated and discussed, providing a comprehensive overview of the challenge.
Abstract:
The green Cu-NirK from Haloferax mediterranei (Cu-NirK) has been expressed, refolded and retrieved as a trimeric enzyme using an expression method developed for halophilic Archaea. This method uses Haloferax volcanii as a halophilic host and an expression vector with a strong constitutive promoter. The enzymatic activity of recombinant Cu-NirK was detected in both cellular fractions (cytoplasmic fraction and membranes) and in the culture media. Characterization of the enzyme isolated from the cytoplasmic fraction and from the culture media revealed important differences in the primary structure of the two forms, indicating that Hfx. mediterranei could carry out a maturation and exportation process within the cell before the protein is exported to the S-layer. Several conserved signals found in the Cu-NirK sequence from Hfx. mediterranei indicate that these processes are closely related to the Tat system. Furthermore, the N-terminal sequence of the two Cu-NirK subunits constituting different isoforms revealed that translation of this protein could begin at two different points, identifying two possible start codons. The hypothesis proposed in this work for halophilic Cu-NirK processing and exportation via the Tat system represents the first approximation of this mechanism in the Halobacteriaceae family and in Prokarya in general.