989 results for Machine translation


Relevance:

100.00%

Publisher:

Abstract:

This paper investigates several approaches to bootstrapping a new spoken language understanding (SLU) component in a target language, given a large dataset of semantically annotated utterances in some other source language. The aim is to reduce the cost of porting a spoken dialogue system from one language to another by minimising the amount of data required in the target language. Since word-level semantic annotations are costly, Semantic Tuple Classifiers (STCs) are used in conjunction with statistical machine translation models, both of which are trained from unaligned data, to further reduce development time. The paper presents experiments in which a French SLU component in the tourist information domain is bootstrapped from English data. Results show that training STCs on automatically translated data produced the best performance for predicting the utterance's dialogue act type; however, individual slot/value pairs are best predicted by training STCs on the source language and using them to decode translated utterances. © 2010 ISCA.

Relevance:

100.00%

Publisher:

Abstract:

This paper introduces a rule-based classification of single-word and compound verbs into a statistical machine translation approach. By substituting verb forms with the lemma of their head verb, the data sparseness problem caused by highly inflected languages can be addressed successfully. In addition, the information from seen verb forms can be used to generate new translations for unseen verb forms. Translation results for an English-to-Spanish task are reported, showing a significant performance improvement.
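As a rough illustration of the substitution idea described in this abstract, the sketch below collapses a run of auxiliaries plus a head verb into the head verb's lemma. The word lists and the function name `collapse_verbs` are hypothetical toy stand-ins; the paper's actual rule set is not given in the abstract, and a real system would presumably rely on a tagger and lemmatizer.

```python
# Toy, hypothetical lexicon (an assumption for illustration only).
AUXILIARIES = {"has", "have", "had", "been", "is", "are", "was", "were", "will", "would"}
LEMMAS = {
    "eating": "eat", "eaten": "eat", "ate": "eat", "eats": "eat",
    "going": "go", "gone": "go", "went": "go", "goes": "go",
}

def collapse_verbs(tokens):
    """Replace each (auxiliaries + known head verb) group with the head verb's lemma."""
    out = []
    i = 0
    while i < len(tokens):
        j = i
        # Skip over a run of auxiliaries starting at position i.
        while j < len(tokens) and tokens[j].lower() in AUXILIARIES:
            j += 1
        if j < len(tokens) and tokens[j].lower() in LEMMAS:
            # The whole group is represented by the lemma of its head verb.
            out.append(LEMMAS[tokens[j].lower()])
            i = j + 1
        else:
            # No known head verb follows: keep the current token unchanged.
            out.append(tokens[i])
            i += 1
    return out

print(collapse_verbs("she has been eating apples".split()))
```

Collapsing many surface forms onto one token in this way is what reduces sparseness: the alignment and translation models see "eat" far more often than any single inflected form.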

Relevance:

100.00%

Publisher:

Abstract:

In this paper we describe MARIE, an Ngram-based statistical machine translation decoder. It is implemented using a beam search strategy, with distortion (or reordering) capabilities. The underlying translation model is based on an Ngram approach, extended to introduce reordering at the phrase level. The search graph structure is designed to perform very accurate comparisons, which allows for a high level of pruning and improves the decoder's efficiency. We report several techniques for efficiently pruning the search space. The combinatorial explosion of the search space derived from the search graph structure is reduced by limiting the number of reorderings a given translation is allowed to perform, as well as the maximum distance a word (or a phrase) is allowed to be reordered. Finally, we report translation accuracy results on three different translation tasks.
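The two reordering limits mentioned in this abstract can be pictured with a small enumeration. This is only a sketch under stated assumptions, not MARIE's implementation: a "reordering" is taken here to be any step that does not continue monotonically from the previously translated source position, and its "distance" is how far the chosen position lies from the monotone one. The function name and parameters are hypothetical.

```python
def allowed_orders(n, max_reorderings, max_distance):
    """Enumerate source-position orders of length n that respect both limits."""
    results = []

    def extend(order, used, jumps):
        if len(order) == n:
            results.append(tuple(order))
            return
        # Position that would be translated next under monotone decoding.
        expected = order[-1] + 1 if order else 0
        for pos in range(n):
            if pos in used:
                continue
            dist = abs(pos - expected)
            is_jump = dist != 0
            # Prune: the jump is too long, or the jump budget is exhausted.
            if dist > max_distance or (is_jump and jumps >= max_reorderings):
                continue
            used.add(pos)
            order.append(pos)
            extend(order, used, jumps + (1 if is_jump else 0))
            order.pop()
            used.remove(pos)

    extend([], set(), 0)
    return results

# With no reordering allowed, only the monotone order survives.
print(allowed_orders(3, 0, 0))
# With loose limits, every permutation is reachable.
print(len(allowed_orders(3, 10, 10)))
```

Tightening either limit shrinks the set of orders the search has to consider, which is the effect the abstract attributes to these constraints.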

Relevance:

100.00%

Publisher:

Abstract:

In this paper a method to incorporate linguistic information regarding single-word and compound verbs is proposed, as a first step towards an SMT model based on linguistically classified phrases. By substituting these verb structures with the base form of the head verb, we achieve better statistical word alignment performance, and are able to better estimate the translation model and generalize to unseen verb forms during translation. Preliminary experiments for the English-Spanish language pair are performed, and future research lines are detailed. © 2005 Association for Computational Linguistics.

Relevance:

100.00%

Publisher:

Abstract:

We report an empirical study of n-gram posterior probability confidence measures for statistical machine translation (SMT). We first describe an efficient and practical algorithm for rapidly computing n-gram posterior probabilities from large translation word lattices. These probabilities are shown to be a good predictor of whether or not the n-gram is found in human reference translations, motivating their use as a confidence measure for SMT. Comprehensive n-gram precision and word coverage measurements are presented for a variety of different language pairs, domains and conditions. We analyze the effect on reference precision of using single or multiple references, and compare the precision of posteriors computed from k-best lists to those computed over the full evidence space of the lattice. We also demonstrate improved confidence by combining multiple lattices in a multi-source translation framework. © 2012 The Author(s).
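The k-best variant mentioned in this abstract can be sketched briefly. The example below assumes each hypothesis comes with a log score and defines the posterior of an n-gram as the normalized probability mass of the hypotheses that contain it. The function name and input format are hypothetical, and this is not the lattice algorithm the paper describes, only the simpler k-best approximation it is compared against.

```python
import math
from collections import defaultdict

def ngram_posteriors(kbest, n):
    """kbest: list of (tokens, log_score) pairs. Returns {ngram: posterior}."""
    # Subtract the maximum log score before exponentiating, for numerical stability.
    max_log = max(score for _, score in kbest)
    weights = [math.exp(score - max_log) for _, score in kbest]
    total = sum(weights)

    posteriors = defaultdict(float)
    for (tokens, _), weight in zip(kbest, weights):
        # Count each n-gram at most once per hypothesis.
        seen = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for gram in seen:
            posteriors[gram] += weight / total
    return dict(posteriors)

kbest = [("a b c".split(), 0.0), ("a b d".split(), 0.0)]
post = ngram_posteriors(kbest, 2)
print(post[("a", "b")], post[("b", "c")])
```

An n-gram shared by all hypotheses gets a posterior of 1, while one that appears only in low-scoring hypotheses gets a value near 0, which is what makes the quantity usable as a confidence measure.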

Relevance:

100.00%

Publisher:

Abstract:

We propose a new formally syntax-based method for statistical machine translation. Transductions between parse trees are transformed into a sequence tagging problem, which is then tackled by a search-based structured prediction method. This allows us to automatically acquire translation knowledge from a parallel corpus without the need for complex linguistic parsing. The method can achieve results comparable to those of a phrase-based method (such as Pharaoh) while using only about ten percent of the translation table entries. Experiments show that the structured prediction approach to SMT is promising because of its strong ability to combine words.

Relevance:

100.00%

Publisher:

Abstract:

Jones, E. H. G., Thomas, Ned, and King, Alan, 'Machine Translation and the Internet', Mercator Media Forums (2001) 5(1) pp.84-98 RAE2008

Relevance:

100.00%

Publisher:

Abstract:

The present dissertation examines how grammatical aspect and mood are handled by machine translation (MT) systems within the scope of imperative sentences (orders, recommendations) for the language pair French-Greek (unidirectional, towards Greek). As the grammatical category of aspect is not expressed in the same way in the two languages, choosing the correct aspect value when translating a verb from French to Greek can pose problems. We are interested in describing the types of errors that occur and their frequency in a corpus taken from texts pertaining to the security domain and from technical manuals, where imperative sentences are very common. To further delimit our research, our sample consists of sentences that comply with the general principles of simplicity and readability provided by several controlled language guidelines, which aim at higher translatability with MT in mind. In a second phase, this study aims to discover whether modifying some of the control rules would help the MT systems decide better on the translation of aspect and mood.