920 results for Newspaper translation


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper investigates unsupervised test-time adaptation of language models (LMs) using discriminative methods for a Mandarin broadcast speech transcription and translation task. A standard approach to adapting interpolated language models is to optimize the component weights by minimizing the perplexity on supervision data. This is a widely made approximation for language modeling in automatic speech recognition (ASR) systems. For speech translation tasks, it is unclear whether a strong correlation still exists between perplexity and the various error cost functions used in the recognition and translation stages. The proposed minimum Bayes risk (MBR) based approach provides a flexible framework for unsupervised LM adaptation that generalizes to a variety of recognition and translation error metrics. LM adaptation is performed at the audio document level using either the character error rate (CER) or the translation edit rate (TER) as the cost function. An efficient parameter estimation scheme using the extended Baum-Welch (EBW) algorithm is proposed. Experimental results on a state-of-the-art speech recognition and translation system are presented. The MBR-adapted language models gave the best recognition and translation performance and reduced the TER score by up to 0.54% absolute. © 2007 IEEE.
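
The perplexity-based baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's MBR/EBW method. It assumes a linear interpolation of component language models and uses the standard EM update for mixture weights, which never increases perplexity on the adaptation text. The function names and the toy probabilities are hypothetical and chosen only for illustration.

```python
import math


def perplexity(component_probs, weights):
    """Perplexity of the interpolated model on the adaptation text.

    component_probs[t][k] is the probability that component LM k assigns
    to token t (given its history); weights[k] is the interpolation weight.
    """
    log_sum = 0.0
    for probs in component_probs:
        mix = sum(w * p for w, p in zip(weights, probs))
        log_sum += math.log(mix)
    return math.exp(-log_sum / len(component_probs))


def em_interpolation_weights(component_probs, iterations=50):
    """Estimate interpolation weights with EM, starting from uniform weights."""
    n_components = len(component_probs[0])
    weights = [1.0 / n_components] * n_components
    for _ in range(iterations):
        # E-step: accumulate each component's posterior responsibility per token.
        counts = [0.0] * n_components
        for probs in component_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            for k in range(n_components):
                counts[k] += weights[k] * probs[k] / mix
        # M-step: renormalize the accumulated responsibilities.
        total = sum(counts)
        weights = [c / total for c in counts]
    return weights


# Toy per-token probabilities from two hypothetical component LMs
# (made-up numbers, not data from the paper).
toy_probs = [
    [0.20, 0.05],
    [0.10, 0.02],
    [0.30, 0.10],
    [0.01, 0.20],
]
uniform_weights = [0.5, 0.5]
adapted_weights = em_interpolation_weights(toy_probs)
```

Comparing `perplexity(toy_probs, adapted_weights)` with `perplexity(toy_probs, uniform_weights)` shows the effect of the adaptation. The abstract's point is that a lower perplexity does not necessarily translate into a lower CER or TER, which is what motivates optimizing those error metrics directly.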

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper describes the development of the CU-HTK Mandarin Speech-To-Text (STT) system and assesses its performance as part of a transcription-translation pipeline which converts broadcast Mandarin audio into English text. Recent improvements to the STT system are described and these give Character Error Rate (CER) gains of 14.3% absolute for a Broadcast Conversation (BC) task and 5.1% absolute for a Broadcast News (BN) task. The output of these STT systems is then post-processed, so that it consists of sentence-like segments, and translated into English text using a Statistical Machine Translation (SMT) system. The performance of the transcription-translation pipeline is evaluated using the Translation Edit Rate (TER) and BLEU metrics. It is shown that improving both the STT system and the post-STT segmentations can lower the TER scores by up to 5.3% absolute and increase the BLEU scores by up to 2.7% absolute. © 2007 IEEE.
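
As a rough illustration of the CER metric cited in this abstract, the sketch below computes character error rate as the Levenshtein distance between reference and hypothesis, divided by the reference length. This is the usual textbook definition and an assumption here, since the abstract does not spell out the scoring details. The function names and the example strings are hypothetical. TER is not covered by this sketch, since it additionally allows block shifts.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance (substitutions, insertions, deletions) over characters."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER as a fraction: character edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcripts: one substituted character out of six.
example_cer = character_error_rate("今天天气很好", "今天天汽很好")
```

An "absolute" gain, as reported in the abstract, is a difference between two such rates expressed in percentage points, not a relative reduction.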

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Using "community structure" theory, a sample of major newspapers in 28 large U.S. cities was examined for coverage of the issue "managing water contamination and access to drinking water". Through analysis of all articles of more than 250 words published over the ten years between 01/01/2001 and 01/01/2011 (339 articles), community characteristics were systematically compared with Pollock's "Media Vector" (which combines two content measures into a single value: an article's "prominence" in a newspaper and its direction or tone). "Favorable" coverage, which supports greater government assistance to improve the drinking water supply, was linked to "stakeholder" measures, for example the percentage of Hispanics (Pearson's r = .349, p = .04). Regression analysis of the measures revealed two significant measures associated with support for government management of drinking water: percentage of Hispanics (12.2% of the variance) and percentage of citizens aged 18-24 (16.7%). Unexpectedly, coverage of government management to improve drinking water supplies was linked neither to measures of "vulnerability" (poverty, unemployment) nor to measures of "stability" (education, income).