Digital Library

916 results for Translation termination

Translation Induction on Indian Language Corpora Using Translingual Themes from Other Languages

Relevance:

20.00%

Abstract:

Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high-quality corpora, especially in Indian languages, makes this problem hard: state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, but a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other "auxiliary" languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.
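For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of how a mean reciprocal rank is typically computed, together with a check of the quoted relative improvement. The function name, the toy candidate lists, and the gold sets are hypothetical illustrations, not data or code from the paper.

```python
def mean_reciprocal_rank(ranked_candidates, gold):
    """Average over queries of 1/rank of the first correct candidate.

    A query whose candidate list contains no correct translation
    contributes 0 to the average.
    """
    total = 0.0
    for candidates, correct in zip(ranked_candidates, gold):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in correct:
                total += 1.0 / rank
                break
    return total / len(ranked_candidates)


# Hypothetical toy data: three source words, each with a ranked list of
# proposed translations and a set of accepted translations.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
gold = [{"a"}, {"y"}, {"missing"}]

mrr = mean_reciprocal_rank(ranked, gold)  # (1 + 1/2 + 0) / 3

# Arithmetic check of the figures quoted in the abstract:
# a 124% relative improvement over the 0.187 baseline.
baseline = 0.187
improved = baseline * (1 + 1.24)
```

Rounded to three decimals, `improved` matches the 0.419 reported for Telugu-Kannada.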

Improving speech transcription for Mandarin-English translation

Relevance:

20.00%

Evaluation of termination techniques for 4H-SiC PiN diodes and trench JFETs

Relevance:

20.00%

Abstract:

An investigation concerning suitable termination techniques for 4H-SiC trench JFETs is presented. Field plates, p+ floating rings and junction termination extension techniques are used to terminate 1.2 kV class PiN diodes. The fabricated PiN diodes evaluated here have a similar design to trench JFETs. Therefore, the conclusions for PiN diodes can be applied to JFET structures as well. Numerical simulations are also used to illustrate the effect of the terminations on the diodes' blocking mode behaviour.

The mechanism of the sudden termination of carbon nanotube supergrowth

Relevance:

20.00%

Discriminative language model adaptation for Mandarin broadcast speech transcription and translation

Relevance:

20.00%

Abstract:

This paper investigates unsupervised test-time adaptation of language models (LM) using discriminative methods for a Mandarin broadcast speech transcription and translation task. A standard approach to adapting interpolated language models is to optimize the component weights by minimizing the perplexity on supervision data. This is a widely made approximation for language modeling in automatic speech recognition (ASR) systems. For speech translation tasks, it is unclear whether a strong correlation still exists between perplexity and various forms of error cost functions in recognition and translation stages. The proposed minimum Bayes risk (MBR) based approach provides a flexible framework for unsupervised LM adaptation. It generalizes to a variety of forms of recognition and translation error metrics. LM adaptation is performed at the audio document level using either the character error rate (CER) or the translation edit rate (TER) as the cost function. An efficient parameter estimation scheme using the extended Baum-Welch (EBW) algorithm is proposed. Experimental results on a state-of-the-art speech recognition and translation system are presented. The MBR-adapted language models gave the best recognition and translation performance and reduced the TER score by up to 0.54% absolute. © 2007 IEEE.
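The "standard approach" this abstract contrasts itself with, tuning interpolation weights to minimize perplexity, can be sketched as below. This illustrates only the perplexity-based baseline, not the paper's MBR method or its extended Baum-Welch estimation. The expectation-maximization update is a common way to fit mixture weights and is an assumption here; the function names and the per-token probabilities are hypothetical.

```python
import math


def perplexity(weights, probs):
    """Perplexity of a linearly interpolated LM on a token sequence.

    probs[i][k] is the probability component LM k assigns to token i.
    """
    log_sum = sum(
        math.log(sum(w * p for w, p in zip(weights, row))) for row in probs
    )
    return math.exp(-log_sum / len(probs))


def em_interpolation_weights(probs, iterations=50):
    """Fit mixture weights by EM, starting from uniform weights.

    Each iteration cannot increase the perplexity on `probs`.
    """
    k = len(probs[0])
    weights = [1.0 / k] * k
    for _ in range(iterations):
        counts = [0.0] * k
        for row in probs:
            mix = sum(w * p for w, p in zip(weights, row))
            for j in range(k):
                # Posterior responsibility of component j for this token.
                counts[j] += weights[j] * row[j] / mix
        weights = [c / len(probs) for c in counts]
    return weights


# Hypothetical probabilities from two component LMs on four tokens.
token_probs = [
    [0.20, 0.01],
    [0.10, 0.02],
    [0.05, 0.30],
    [0.15, 0.01],
]

uniform = [0.5, 0.5]
tuned = em_interpolation_weights(token_probs)
ppl_uniform = perplexity(uniform, token_probs)
ppl_tuned = perplexity(tuned, token_probs)
```

On this toy data the tuned weights favour the first component, which assigns higher probability to three of the four tokens, and the tuned perplexity is no higher than the uniform one.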

Improving speech transcription for Mandarin-English translation

Relevance:

20.00%

Abstract:

This paper describes the development of the CU-HTK Mandarin Speech-To-Text (STT) system and assesses its performance as part of a transcription-translation pipeline which converts broadcast Mandarin audio into English text. Recent improvements to the STT system are described and these give Character Error Rate (CER) gains of 14.3% absolute for a Broadcast Conversation (BC) task and 5.1% absolute for a Broadcast News (BN) task. The output of these STT systems is then post-processed, so that it consists of sentence-like segments, and translated into English text using a Statistical Machine Translation (SMT) system. The performance of the transcription-translation pipeline is evaluated using the Translation Edit Rate (TER) and BLEU metrics. It is shown that improving both the STT system and the post-STT segmentations can lower the TER scores by up to 5.3% absolute and increase the BLEU scores by up to 2.7% absolute. © 2007 IEEE.
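As a rough illustration of the Character Error Rate mentioned in this abstract, the sketch below computes CER as the character-level edit distance divided by the reference length. This is the usual textbook definition and is assumed here; the abstract does not give the scoring details used for the CU-HTK system, and the example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance over characters, normalized by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one substituted and one deleted character
# against a six-character reference.
reference = "今天天气很好"
hypothesis = "今天天汽好"
cer = character_error_rate(reference, hypothesis)
```

An "absolute" gain, as quoted in the abstract, is the plain difference between two such rates expressed in percentage points, rather than a ratio of one rate to the other.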

Efficient path counting transducers for minimum Bayes-risk decoding of statistical machine translation lattices

Relevance:

20.00%

Fluency constraints for minimum Bayes-risk decoding of statistical machine translation lattices

Relevance:

20.00%

The CUED HiFST system for the WMT10 translation shared task

Relevance:

20.00%

Context-dependent alignment models for statistical machine translation

Relevance:

20.00%

Hierarchical phrase-based translation with weighted finite state transducers

Relevance:

20.00%

Minimum Bayes risk combination of translation hypotheses from alternative morphological decompositions

Relevance:

20.00%

Rule filtering by pattern for efficient hierarchical translation

Relevance:

20.00%

Large-scale statistical machine translation with weighted finite state transducers

Relevance:

20.00%

European language translation with weighted finite state transducers: the CUED MT system

Relevance:

20.00%
