641 resultados para COLEGIO SAN GREGORIO HERNÁNDEZ – CONTABILIDAD - BARRIO SANTA LIBRADA, LOCALIDAD DE USME (BOGOTÁ, COLOMBIA)
Resumo:
In this paper, we describe a complete development platform that features different innovative acceleration strategies, not included in any other current platform, that simplify and speed up the definition of the different elements required to design a spoken dialog service. The proposed accelerations are mainly based on using the information from the backend database schema and contents, as well as cumulative information produced throughout the different steps in the design. Thanks to these accelerations, the interaction between the designer and the platform is improved, and in most cases the design is reduced to simple confirmations of the “proposals” that the platform dynamically provides at each step. In addition, the platform provides several other accelerations such as configurable templates that can be used to define the different tasks in the service or the dialogs to obtain or show information to the user, automatic proposals for the best way to request slot contents from the user (i.e. using mixed-initiative forms or directed forms), an assistant that offers the set of more probable actions required to complete the definition of the different tasks in the application, or another assistant for solving specific modality details such as confirmations of user answers or how to present them the lists of retrieved results after querying the backend database. Additionally, the platform also allows the creation of speech grammars and prompts, database access functions, and the possibility of using mixed initiative and over-answering dialogs. In the paper we also describe in detail each assistant in the platform, emphasizing the different kind of methodologies followed to facilitate the design process at each one. Finally, we describe the results obtained in both a subjective and an objective evaluation with different designers that confirm the viability, usefulness, and functionality of the proposed accelerations. Thanks to the accelerations, the design time is reduced in more than 56% and the number of keystrokes by 84%.
Resumo:
This paper proposes the use of Factored Translation Models (FTMs) for improving a Speech into Sign Language Translation System. These FTMs allow incorporating syntactic-semantic information during the translation process. This new information permits to reduce significantly the translation error rate. This paper also analyses different alternatives for dealing with the non-relevant words. The speech into sign language translation system has been developed and evaluated in a specific application domain: the renewal of Identity Documents and Driver’s License. The translation system uses a phrase-based translation system (Moses). The evaluation results reveal that the BLEU (BiLingual Evaluation Understudy) has improved from 69.1% to 73.9% and the mSER (multiple references Sign Error Rate) has been reduced from 30.6% to 24.8%.
Resumo:
This paper describes a categorization module for improving the performance of a Spanish into Spanish Sign Language (LSE) translation system. This categorization module replaces Spanish words with associated tags. When implementing this module, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not relevant in the translation process. The categorization module has been incorporated into a phrase-based system and a Statistical Finite State Transducer (SFST). The evaluation results reveal that the BLEU has increased from 69.11% to 78.79% for the phrase-based system and from 69.84% to 75.59% for the SFST.
Resumo:
This paper describes the UPM system for translation task at the EMNLP 2011 workshop on statistical machine translation (http://www.statmt.org/wmt11/), and it has been used for both directions: Spanish-English and English-Spanish. This system is based on Moses with two new modules for pre and post processing the sentences. The main contribution is the method proposed (based on the similarity with the source language test set) for selecting the sentences for training the models and adjusting the weights. With system, we have obtained a 23.2 BLEU for Spanish-English and 21.7 BLEU for EnglishSpanish
Resumo:
This paper describes the design, development and field evaluation of a machine translation system from Spanish to Spanish Sign Language (LSE: Lengua de Signos Española). The developed system focuses on helping Deaf people when they want to renew their Driver’s License. The system is made up of a speech recognizer (for decoding the spoken utterance into a word sequence), a natural language translator (for converting a word sequence into a sequence of signs belonging to the sign language), and a 3D avatar animation module (for playing back the signs). For the natural language translator, three technological approaches have been implemented and evaluated: an example-based strategy, a rule-based translation method and a statistical translator. For the final version, the implemented language translator combines all the alternatives into a hierarchical structure. This paper includes a detailed description of the field evaluation. This evaluation was carried out in the Local Traffic Office in Toledo involving real government employees and Deaf people. The evaluation includes objective measurements from the system and subjective information from questionnaires. The paper details the main problems found and a discussion on how to solve them (some of them specific for LSE).
Resumo:
Este artículo describe una estrategia de selección de frases para hacer el ajuste de un sistema de traducción estadístico basado en el decodificador Moses que traduce del español al inglés. En este trabajo proponemos dos posibilidades para realizar esta selección de las frases del corpus de validación que más se parecen a las frases que queremos traducir (frases de test en lengua origen). Con esta selección podemos obtener unos mejores pesos de los modelos para emplearlos después en el proceso de traducción y, por tanto, mejorar los resultados. Concretamente, con el método de selección basado en la medida de similitud propuesta en este artículo, mejoramos la medida BLEU del 27,17% con el corpus de validación completo al 27,27% seleccionando las frases para el ajuste. Estos resultados se acercan a los del experimento ORACLE: se utilizan las mismas frases de test para hacer el ajuste de los pesos. En este caso, el BLEU obtenido es de 27,51%.
Resumo:
This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text to speech conversion system. The main target is to generate a language independent text normalization module, based on data and flexible enough to deal with all situa-tions presented in this task. The proposed architecture is composed by three main modules: a tokenizer module for splitting the text input into a token graph (tokenization), a phrase-based translation module (token translation) and a post-processing module for removing some tokens. This paper presents initial exper-iments for numbers and abbreviations. The very good results obtained validate the proposed architecture.
Resumo:
This paper presents an automatic strategy to decide how to pronounce a Capital Letter Sequence (CLS) in a Text to Speech system (TTS). If CLS is well known by the TTS, it can be expanded in several words. But when the CLS is unknown, the system has two alternatives: spelling it (abbreviation) or pronouncing it as a new word (acronym). In Spanish, there is a high relationship between letters and phonemes. Because of this, when a CLS is similar to other words in Spanish, there is a high tendency to pronounce it as a standard word. This paper proposes an automatic method for detecting acronyms. Additionaly, this paper analyses the discrimination capability of some features, and several strategies for combining them in order to obtain the best classifier. For the best classifier, the classification error is 8.45%. About the feature analysis, the best features have been the Letter Sequence Perplexity and the Average N-gram order.
Resumo:
This paper describes the UPM system for the Spanish-English translation task at the NAACL 2012 workshop on statistical machine translation. This system is based on Moses. We have used all available free corpora, cleaning and deleting some repetitions. In this paper, we also propose a technique for selecting the sentences for tuning the system. This technique is based on the similarity with the sentences to translate. With our approach, we improve the BLEU score from 28.37% to 28.57%. And as a result of the WMT12 challenge we have obtained a 31.80% BLEU with the 2012 test set. Finally, we explain different experiments that we have carried out after the competition.
Resumo:
In this paper, we describe new results and improvements to a lan-guage identification (LID) system based on PPRLM previously introduced in [1] and [2]. In this case, we use as parallel phone recognizers the ones provided by the Brno University of Technology for Czech, Hungarian, and Russian lan-guages, and instead of using traditional n-gram language models we use a lan-guage model that is created using a ranking with the most frequent and discrim-inative n-grams. In this language model approach, the distance between the ranking for the input sentence and the ranking for each language is computed, based on the difference in relative positions for each n-gram. This approach is able to model reliably longer span information than in traditional language models obtaining more reliable estimations. We also describe the modifications that we have being introducing along the time to the original ranking technique, e.g., different discriminative formulas to establish the ranking, variations of the template size, the suppression of repeated consecutive phones, and a new clus-tering technique for the ranking scores. Results show that this technique pro-vides a 12.9% relative improvement over PPRLM. Finally, we also describe re-sults where the traditional PPRLM and our ranking technique are combined.
Resumo:
This paper describes a low complexity strategy for detecting and recognizing text signs automatically. Traditional approaches use large image algorithms for detecting the text sign, followed by the application of an Optical Character Recognition (OCR) algorithm in the previously identified areas. This paper proposes a new architecture that applies the OCR to a whole lightly treated image and then carries out the text detection process of the OCR output. The strategy presented in this paper significantly reduces the processing time required for text localization in an image, while guaranteeing a high recognition rate. This strategy will facilitate the incorporation of video processing-based applications into the automatic detection of text sign similar to that of a smartphone. These applications will increase the autonomy of visually impaired people in their daily life.
Resumo:
This paper proposes a methodology for developing a speech into sign language translation system considering a user-centered strategy. This method-ology consists of four main steps: analysis of technical and user requirements, data collection, technology adaptation to the new domain, and finally, evalua-tion of the system. The two most demanding tasks are the sign generation and the translation rules generation. Many other aspects can be updated automatical-ly from a parallel corpus that includes sentences (in Spanish and LSE: Lengua de Signos Española) related to the application domain. In this paper, we explain how to apply this methodology in order to develop two translation systems in two specific domains: bus transport information and hotel reception.
Resumo:
This paper presents the SAILSE Project (Sistema Avanzado de Información en Lengua de Signos Española ? Spanish Sign Language Advanced Information System). This project aims to develop an interactive system for facilitating the communication between a hearing and a deaf person. The first step has been the linguistic study, including a sentence collection, its translation into LSE (Lengua de Signos Española - Spanish Sign Language), and sign generation. After this analysis, the paper describes the interactive system that integrates an avatar to represent the signs, a text to speech converter and several translation technologies. Finally, this paper presents the set up carried out with deaf people and the main conclusions extracted from it.
Resumo:
This paper presents a methodology for adapting an advanced communication system for deaf people in a new domain. This methodology is a user-centered design approach consisting of four main steps: requirement analysis, parallel corpus generation, technology adaptation to the new domain, and finally, system evaluation. In this paper, the new considered domain has been the dialogues in a hotel reception. With this methodology, it was possible to develop the system in a few months, obtaining very good performance: good speech recognition and translation rates (around 90%) with small processing times.
Resumo:
This paper describes the text normalization module of a text to speech fully-trainable conversion system and its application to number transcription. The main target is to generate a language independent text normalization module, based on data instead of on expert rules. This paper proposes a general architecture based on statistical machine translation techniques. This proposal is composed of three main modules: a tokenizer for splitting the text input into a token graph, a phrase-based translation module for token translation, and a post-processing module for removing some tokens. This architecture has been evaluated for number transcription in several languages: English, Spanish and Romanian. Number transcription is an important aspect in the text normalization problem.