956 results for Language Models


Relevance: 60.00%

Abstract:

This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation-side-based cepstral normalization, vocal tract length normalization, heteroscedastic linear discriminant analysis for feature projection, minimum phone error training and speaker adaptive training, lattice-based model adaptation, confusion-network-based decoding and confidence score estimation, pronunciation selection, language model interpolation, and class-based language models. The transcription system developed for participation in the 2002 NIST Rich Transcription evaluations of English conversational telephone speech data is presented in detail. In this evaluation the CU-HTK system gave an overall word error rate of 23.9%, which was the best performance by a statistically significant margin. Further details on the derivation of faster systems with moderate performance degradation are discussed in the context of the 2002 CU-HTK 10 × RT conversational speech transcription system. © 2005 IEEE.
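Among the techniques listed, language model interpolation is easy to illustrate. Below is a minimal sketch of linear interpolation between component models; the models and weights are toy placeholders, not the CU-HTK configuration.

```python
# Minimal sketch of linear language-model interpolation; the component
# models and weights below are toy placeholders, not the CU-HTK setup.

def interpolate(models, weights, word, history):
    """P(word | history) as a weighted mixture of component models."""
    assert abs(sum(weights) - 1.0) < 1e-9   # weights tuned on held-out data
    return sum(wt * m(word, history) for m, wt in zip(models, weights))

# Two toy component "models" (unigram lookups that ignore the history):
conversational = lambda w, h: {"yeah": 0.04, "uh": 0.05}.get(w, 0.001)
background = lambda w, h: {"the": 0.06, "yeah": 0.001}.get(w, 0.001)

p = interpolate([conversational, background], [0.7, 0.3], "yeah", ())
```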

Relevance: 60.00%

Abstract:

The article analyzes the legal regime of Euskara in the education system of the Autonomous Community of the Basque Country (CAPV). In the CAPV, legislation recognizes the right to choose the language of study during the educational cycle, and students are separated into different classrooms based on their language preference. This system of separation (of language models) has made great strides possible, although its implementation also reveals aspects which, from the perspective of a pluralistic Basque society on its way towards greater social, political and language integration, call for further reflection. The general model for language planning in the CAPV was fashioned in the eighties as a model characterized by the guarantee of spaces of language freedom, and the educational system was charged with making the learning of the region's autochthonous language more widespread. At this point, we already have a fair degree of evidence on which to base an analysis of the system of language models, and we are in a position to conclude that the educational system was perhaps given too heavy a burden. Official studies of the language performance of Basque schoolchildren show, in a way that is now fully verified, that not all students who finish their mandatory schooling achieve the level of knowledge of Euskara required by the regulations. Faced with this reality, it becomes necessary to articulate an alternative to the current configuration of the system of language models, one that will make it possible in the future to have a Basque society that is linguistically more integrated, thereby preventing knowledge (or lack of knowledge) of one of the official languages from becoming a barrier between two communities. Many voices have urged a reconsideration of the system of language models, and the Basque Parliament itself has asked the Department of Education to design a new system. This article analyzes the legal foundations on which the current system is built and explores potential avenues of legal cooperation that would make it possible to move towards a new system aimed at guaranteeing higher rates of bilingualism, one sufficiently flexible to respond to and accommodate the different sociolinguistic realities of the region.

Relevance: 60.00%

Abstract:

Discussion forums have evolved into a dependable source of knowledge to solve common problems. However, only a minority of the posts in discussion forums are solution posts. Identifying solution posts from discussion forums, hence, is an important research problem. In this paper, we present a technique for unsupervised solution post identification leveraging a so far unexplored textual feature, that of lexical correlations between problems and solutions. We use translation models and language models to exploit lexical correlations and solution post character respectively. Our technique is designed not to rely much on structural features such as post metadata, since such features are often not uniformly available across forums. Our clustering-based iterative solution identification approach based on the EM formulation performs favorably in an empirical evaluation, beating the only unsupervised solution identification technique from the literature by a very large margin. We also show that our unsupervised technique is competitive against methods that require supervision, outperforming one such technique comfortably.
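To make the two signals concrete, here is an illustrative sketch of scoring a candidate reply with a solution-post language model plus IBM-Model-1-style lexical correlation against the problem post; all probability tables are toy values, not parameters learned by the paper's EM procedure.

```python
import math

# Two illustrative signals: a unigram "solution-post" language model and an
# IBM-Model-1-style translation table correlating problem and reply words.

def lm_logprob(post, unigram):
    return sum(math.log(unigram.get(w, 1e-6)) for w in post)

def model1_logprob(problem, reply, t_table):
    # Each reply word may align to any problem word (uniform alignment prior).
    return sum(math.log(sum(t_table.get((r, p), 1e-6) for p in problem)
                        / len(problem))
               for r in reply)

problem = "screen flickers after driver update".split()
reply = "roll back the driver to stop the flicker".split()
solution_lm = {"roll": 0.01, "back": 0.01, "stop": 0.01, "the": 0.05}
t_table = {("flicker", "flickers"): 0.4, ("driver", "driver"): 0.5}
score = lm_logprob(reply, solution_lm) + model1_logprob(problem, reply, t_table)
```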

Relevance: 60.00%

Abstract:

We consider the problem of segmenting text documents that have a two-part structure, such as a problem part and a solution part. Documents of this genre include incident reports, which typically involve a description of events relating to a problem followed by those pertaining to the solution that was tried. Segmenting such documents into their two component parts would render them usable in knowledge reuse frameworks such as Case-Based Reasoning. This segmentation problem presents a hard case for traditional text segmentation due to the lexical inter-relatedness of the segments. We develop a two-part segmentation technique that can harness a corpus of similar documents to model the behavior of the two segments and their inter-relatedness, using language models and translation models respectively. In particular, we use separate language models for the problem and solution segment types, whereas the inter-relatedness between segment types is modeled using an IBM Model 1 translation model. We model documents as being generated starting from the problem part, which comprises words sampled from the problem language model, followed by the solution part, whose words are sampled either from the solution language model or from a translation model conditioned on the words already chosen in the problem part. We show, through an extensive set of experiments on real-world data, that our approach outperforms state-of-the-art text segmentation algorithms in segmentation accuracy, and that such improved accuracy translates well to improved usability in Case-Based Reasoning systems. We also analyze the robustness of our technique to varying amounts and types of noise and empirically illustrate that our technique is quite noise tolerant and degrades gracefully with increasing amounts of noise.
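A minimal sketch of the generative scoring just described, assuming toy probability tables: problem words are drawn from a problem language model, each solution word from a mixture of a solution language model and an IBM Model 1 translation probability conditioned on the problem words, and the split maximizing the total likelihood is chosen.

```python
import math

# Toy two-part scoring: problem = words[:k] under a problem LM; each word in
# solution = words[k:] under a mixture of a solution LM and an IBM Model 1
# translation probability conditioned on the problem words.

def seg_score(words, k, p_lm, s_lm, t_table, lam=0.5):
    problem, solution = words[:k], words[k:]
    score = sum(math.log(p_lm.get(w, 1e-6)) for w in problem)
    for w in solution:
        p_trans = sum(t_table.get((w, pw), 1e-6) for pw in problem) / len(problem)
        score += math.log(lam * s_lm.get(w, 1e-6) + (1 - lam) * p_trans)
    return score

def best_split(words, p_lm, s_lm, t_table):
    """Index k that maximizes the two-part likelihood of the document."""
    return max(range(1, len(words)),
               key=lambda k: seg_score(words, k, p_lm, s_lm, t_table))
```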

Relevance: 60.00%

Abstract:

Phrase-based statistical machine translation systems translate sentences one segment at a time, in several steps. At each step, these systems consider only very little information when choosing the translation of a segment: the scores of the bilingual phrase dictionary are computed without regard to the contexts in which the phrases are used, and the language models consider only the few words surrounding the translated segment. In this thesis, we propose a new model that considers the entire sentence when selecting each target word. Our context-integration model differs from previous ones in its use of a multilayer perceptron (MLP). An interesting property of MLPs is their hidden layer, which offers an alternative to word-based representations for encoding the sentences to be translated. A shallow evaluation of this alternative representation showed that it is able to group certain similar source sentences even when they are worded differently. We first compared the predictions of our MLPs favourably against those of IBM1, a model commonly used in translation. We then integrated our MLPs into our English-to-French statistical machine translation system; they improved the translations of both our baseline system and a second reference system into which IBM1 had been integrated.
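As a rough sketch of the idea (not the thesis's actual architecture), the following NumPy toy encodes the whole source sentence as a bag-of-words vector, passes it through a hidden layer (the alternative sentence representation mentioned above), and produces a distribution over candidate target words; all dimensions and weights are made up.

```python
import numpy as np

# Toy MLP: encode the whole source sentence as a bag-of-words vector,
# compress it in a hidden layer, and score candidate target words.

rng = np.random.default_rng(0)
V_SRC, V_TGT, H = 1000, 1200, 64
W1 = rng.normal(0.0, 0.1, (H, V_SRC))     # source words -> hidden layer
W2 = rng.normal(0.0, 0.1, (V_TGT, H))     # hidden layer -> target scores

def score_targets(src_word_ids):
    x = np.zeros(V_SRC)
    x[src_word_ids] = 1.0                 # whole-sentence bag of words
    h = np.tanh(W1 @ x)                   # alternative sentence representation
    logits = W2 @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()                    # P(target word | entire source)

p = score_targets([3, 42, 517])
```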

Relevance: 60.00%

Abstract:

Grammar-checking software sometimes makes illegitimate detections (false alarms), which we call overdetections here. This study describes the development of a system created to identify and mute the overdetections produced by the French grammar checker designed by the company Druide informatique. Several classifiers were trained in a supervised fashion on 14 types of detections made by the checker, using features covering various linguistic information (dependencies and syntactic categories, exploration of word context, etc.) extracted from sentences with and without overdetections. Eight of the 14 classifiers developed are now integrated into the new version of a very popular commercial grammar checker. Our experiments also showed that probabilistic language models, SVMs, and word-sense disambiguation improve the quality of these classifiers. This work is a successful example of deploying a machine learning approach in the service of a robust, mainstream language application.
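A hedged sketch of the supervised setup, with hypothetical features and toy training data standing in for Druide's proprietary feature set: a linear SVM trained to keep legitimate detections and mute overdetections.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical context features for two detections; 1 = legitimate
# detection, 0 = overdetection (false alarm) that should be muted.
train = [
    ({"pos_left": "DET", "pos_right": "NOUN", "dep": "det"}, 1),
    ({"pos_left": "ADV", "pos_right": "VERB", "dep": "advmod"}, 0),
]
X, y = zip(*train)

clf = make_pipeline(DictVectorizer(), LinearSVC())
clf.fit(list(X), list(y))

# Show the detection only if the classifier predicts it is legitimate.
keep = clf.predict([{"pos_left": "DET", "pos_right": "NOUN", "dep": "det"}])
```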

Relevance: 60.00%

Abstract:

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.
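The MDL criterion can be sketched in a few lines: the description length of a corpus under a candidate lexicon is the cost of spelling out the lexicon plus the cost of encoding the corpus as a sequence of lexicon entries, and segmentations that lower this total are preferred. The encoding costs below are simplified assumptions, not the paper's exact coding scheme.

```python
import math
from collections import Counter

def description_length(corpus_tokens, lexicon):
    # Cost of the lexicon: spell out each entry character by character
    # (assume roughly 5 bits per character for a small alphabet).
    lex_bits = sum(5 * len(w) for w in lexicon)
    # Cost of the corpus: -log2 P(token) under the token frequencies.
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    corpus_bits = sum(-c * math.log2(c / total) for c in counts.values())
    return lex_bits + corpus_bits

tokens = ["the", "dog", "the", "cat"]
bits = description_length(tokens, set(tokens))
```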

Relevance: 60.00%

Abstract:

Title: Data-Driven Text Generation using Neural Networks
Speaker: Pavlos Vougiouklis, University of Southampton
Abstract: Recent work on neural networks shows their great potential at tackling a wide variety of Natural Language Processing (NLP) tasks. This talk will focus on the Natural Language Generation (NLG) problem and, more specifically, on the extent to which neural network language models could be employed for context-sensitive and data-driven text generation. In addition, a neural network architecture for response generation in social media, along with the training methods that enable it to capture contextual information and effectively participate in public conversations, will be discussed.
Speaker Bio: Pavlos Vougiouklis obtained his 5-year Diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki in 2013. He was awarded an MSc degree in Software Engineering from the University of Southampton in 2014. In 2015, he joined the Web and Internet Science (WAIS) research group of the University of Southampton, where he is currently working towards a PhD in the field of Neural Network Approaches for Natural Language Processing.

Title: Provenance is Complicated and Boring — Is there a solution?
Speaker: Darren Richardson, University of Southampton
Abstract: Paper trails, auditing, and accountability — arguably not the sexiest terms in computer science. But then you discover that you've possibly been eating horse-meat, and the importance of provenance becomes almost palpable. Having accepted that we should be creating provenance-enabled systems, the challenge of then communicating that provenance to casual users is not trivial: users should not have to have a detailed working knowledge of your system, and they certainly shouldn't be expected to understand the data model. So how, then, do you give users an insight into the provenance, without having to build a bespoke system for each and every different provenance installation?
Speaker Bio: Darren is a final-year Computer Science PhD student. He completed his undergraduate degree in Electronic Engineering at Southampton in 2012.

Relevance: 60.00%

Abstract:

This work describes the implementation of speech recognition software for Brazilian Portuguese. Among its objectives is the construction of a large-vocabulary continuous speech system suitable for real-time applications. The main concepts and characteristics of such systems are presented, along with all the steps required for their construction. As part of this work, several resources were produced and made available: acoustic and language models, and new speech and text corpora. The text corpus has been built through the automatic extraction and formatting of newspaper texts from the Internet. In addition, two speech corpora were produced, one based on audiobooks and another produced specifically to simulate real-time tests. The work also proposes the use of speaker adaptation techniques to resolve acoustic mismatch between speech corpora. Finally, an application programming interface is presented that aims to ease the use of the Julius decoder. Performance tests are presented, comparing the systems developed with a commercial product.

Relevance: 60.00%

Abstract:

The objective was to describe the contributions of Joseph Jules Dejerine and his wife Augusta Dejerine-Klumpke to our understanding of cerebral association fiber tracts and language processing. The Dejerines (and not Constantin von Monakow) were the first to describe the superior longitudinal fasciculus/arcuate fasciculus (SLF/AF) as an association fiber tract uniting Broca's area, Wernicke's area, and a visual image center in the angular gyrus of a left hemispheric language zone. They were also the first to attribute language-related functions to the fasciculus occipito-frontalis (FOF) and the inferior longitudinal fasciculus (ILF), after describing aphasia patients with degeneration of the SLF/AF, ILF, uncinate fasciculus (UF), and FOF. These fasciculi belong to a functional network known as the Dejerines' language zone, which exceeds the borders of the classically defined cortical language centers. The Dejerines provided the first descriptions of the anatomical pillars of present-day language models (such as the SLF/AF). Their anatomical descriptions of fasciculi in aphasia patients provided a foundation for our modern concept of the dorsal and ventral streams in language processing.

Relevance: 60.00%

Abstract:

This paper describes the GTH-UPM system for the Albayzin 2014 Search on Speech Evaluation. The evaluation task consists of searching for a list of terms/queries in audio files. The GTH-UPM system we are presenting is based on an LVCSR (Large Vocabulary Continuous Speech Recognition) system. We have used the MAVIR corpus and the Spanish partition of the EPPS (European Parliament Plenary Sessions) database for training both acoustic and language models. The main effort has been focused on lexicon preparation and text selection for language model construction. The system makes use of different lexicons and language models depending on the task being performed. For the best configuration of the system on the development set, we have obtained a FOM of 75.27 for the keyword spotting task.
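A minimal sketch of the search stage on top of an LVCSR output, assuming a toy list of time-stamped word hypotheses with confidence scores (the actual GTH-UPM pipeline is more elaborate):

```python
# Look up query terms in the word-level LVCSR output, kept here as toy
# (word, start_s, end_s, confidence) tuples, and report hits above a
# confidence threshold.

def search_terms(queries, decoded, min_conf=0.5):
    return [(w, s, e, c) for (w, s, e, c) in decoded
            if w in queries and c >= min_conf]

decoded = [("madrid", 1.20, 1.55, 0.91), ("parlamento", 2.10, 2.70, 0.42)]
hits = search_terms({"madrid", "parlamento"}, decoded)  # keeps only "madrid"
```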

Relevance: 60.00%

Abstract:

The last decade has witnessed major advances in speech recognition technology. Today's commercial systems are able to recognize continuous speech from numerous speakers, with acceptable error rates and without the need for an explicit adaptation procedure. Despite this progress, speech recognition is far from being a solved problem. Most of these systems are adjusted to a particular domain, and their efficacy depends significantly, among many other aspects, on the similarity between the language model used and the task being addressed. This dependence is even more important in scenarios where the statistical properties of the language fluctuate over time, for example in application domains involving spontaneous and multi-topic speech. Over the last years there has been an increasing effort to enhance speech recognition systems for such domains, among other approaches by means of automatic adaptation techniques. These techniques are applied to existing systems, since exporting a system to a new task or domain may be both time-consuming and expensive.

Adaptation techniques require additional sources of information, and spoken language can provide some of them. Speech not only conveys a message; it also provides information about the context in which the spoken communication takes place (e.g. the subject being talked about). Therefore, when we communicate through speech, it is feasible to identify the elements of the language that characterize the context and, at the same time, to track the changes that occur in those elements over time. This information can be extracted and exploited through information retrieval and machine learning techniques, allowing us, within the development of more robust speech recognition systems, to enhance the adaptation of language models to the conditions of the context and thus strengthen the recognition system for domains under changing conditions (such as potential variations in vocabulary, style and topic).

In this sense, the main contribution of this Thesis is the proposal and evaluation of a framework of topic-motivated contextualization based on the dynamic and unsupervised adaptation of language models for the enhancement of an automatic speech recognition system. This adaptation is based on a combined approach (from the perspective of both the information retrieval and machine learning fields) whereby we identify the topics being discussed in an audio recording. Topic identification therefore enables the system to adapt the language model according to the contextual conditions. The proposed framework can be divided into two major systems: a topic identification system and a dynamic language model adaptation system. This Thesis can be outlined from the perspective of the particular contributions made in each of the fields that compose the proposed framework:

- Regarding the topic identification system, we have focused on enhancing document preprocessing techniques and on defining more robust criteria for the selection of index-terms. Within both information retrieval and machine learning based approaches, the efficiency of topic identification systems depends, to a large extent, on the preprocessing applied to the documents. Among the many operations that a preprocessing procedure encloses, an adequate selection of index-terms is critical to establish conceptual and semantic relationships between terms and documents, and the process can also be weakened by a poor choice of stopwords or a lack of precision in the stemming rules. In this regard, we compare and evaluate different criteria for preprocessing the documents and for improving the selection of index-terms, which allows us not only to reduce the size of the indexing structure but also to strengthen the topic identification process. Moreover, since assigning different weights to terms according to their contribution to the content of a document is crucial to the performance of topic identification systems, we evaluate and propose alternatives to traditional weighting schemes (such as tf-idf) that allow us to improve the specificity of terms and to better identify the topics related to each document.

- Regarding dynamic language model adaptation, we divide the contextualization process into several steps. For the generation of topic-based language models, we propose both a supervised and an unsupervised approach. The first generates topic-based language models by grouping the documents in the training set according to the original topic labels of the corpus. Nevertheless, one goal of this Thesis is to evaluate whether or not the use of these labels to generate language models is optimal in terms of recognition accuracy. For this reason, we propose a second, unsupervised approach, in which the objective is to group the data in the training set into automatic topic clusters based on the semantic similarity between the documents; clustering yields a more cohesive association of the documents related by similar concepts, improving the coverage of the topic-based language models and enhancing the performance of the recognition system. We then develop various strategies to create a context-dependent language model whose aim is to reflect the semantic context of the current utterance, i.e. the most relevant topics being discussed. This model is generated by linear interpolation between the topic-based language models related to the most relevant topics, with interpolation weights estimated mainly from the outcome of the topic identification process. Finally, we propose a methodology for the dynamic adaptation of a background language model: the adaptation takes into account the context-dependent model as well as the information provided by the topic identification process, using a linear interpolation between the background model and the context-dependent one, and we study different approaches to determine the interpolation weights used in this scheme.

Once the basis of our topic-motivated contextualization framework was defined, we propose its application to an automatic speech recognition system, focusing on two aspects: the contextualization of the language models used by the system, and the incorporation of semantic information into the topic-based adaptation process. To achieve this, we propose an experimental framework based on a two-stage recognition architecture. In the first stage, information retrieval and machine learning techniques are used to identify the topics in a transcription of an audio segment, generated by the recognition system using a background language model; according to the confidence in the identified topics, dynamic language model adaptation is carried out. In the second stage, the adapted language model is used to re-decode the utterance. To test the benefits of the proposed framework, we evaluate each of the major systems mentioned above. The evaluation is conducted on speeches in the political domain using the EPPS (European Parliament Plenary Sessions) database from the European TC-STAR project, and we analyse several performance metrics that allow us to compare the improvements of the proposed systems against the baseline ones.
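A compact sketch of the two-stage adaptation loop, assuming toy topic documents and placeholder language models: topic relevance is estimated from a first-pass transcription with tf-idf and cosine similarity, the topic-based models are linearly interpolated into a context-dependent model, and that model is in turn interpolated with the background model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy topic corpora; in the thesis these are clusters of training documents.
topic_docs = {"economy": "budget inflation market growth deficit",
              "health":  "hospital vaccine patient care doctor"}
vec = TfidfVectorizer()
T = vec.fit_transform(topic_docs.values())

def topic_weights(first_pass_transcript):
    """Interpolation weights per topic from a first-pass transcription."""
    sims = cosine_similarity(vec.transform([first_pass_transcript]), T).ravel()
    return sims / sims.sum()  # assumes at least one topic matches

def adapted_prob(word, topic_lms, weights, background_lm, mu=0.6):
    """Background LM linearly interpolated with the context-dependent LM."""
    p_context = sum(wt * lm.get(word, 1e-6)
                    for wt, lm in zip(weights, topic_lms))
    return mu * background_lm.get(word, 1e-6) + (1 - mu) * p_context

weights = topic_weights("the parliament discussed the budget and inflation")
topic_lms = [{"budget": 0.02}, {"vaccine": 0.02}]   # placeholder models
p = adapted_prob("budget", topic_lms, weights, {"budget": 0.005})
```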

Relevance: 60.00%

Abstract:

This Final Degree Project studies how to generate VHDL (VHSIC Hardware Description Language) models from dataflow models written in RVC-CAL (Reconfigurable Video Coding - CAL Actor Language) by means of Vivado HLS (Vivado High-Level Synthesis), one of the tools included in Xilinx's Vivado suite. Once the resulting VHDL model is obtained, the intention is to use the Xilinx tools to program it onto an FPGA (Field-Programmable Gate Array) or onto the Zynq device, also developed by Xilinx. RVC-CAL is a dataflow language that describes the functionality of functional blocks called actors. The functionalities an actor implements are defined as actions, which may differ within the same actor. Actors can communicate with one another and form a network of actors. With Vivado HLS we can obtain a VHDL design from a model written in the C language, so generating VHDL models from RVC-CAL ones requires a preliminary phase in which the RVC-CAL models are compiled into their C equivalents. The ORCC compiler (Open RVC-CAL Compiler) is the tool that produces C designs from RVC-CAL models. ORCC does not directly create executable code; it generates source code to be compiled by another tool, in the case of this project the GCC compiler (GNU C Compiler) under Linux. In short, this project comprises three distinct points of study: 1. We start from dataflow models in RVC-CAL, which are compiled by ORCC to obtain their C translations. 2. Once the equivalent C designs are obtained, they are synthesized in Vivado HLS to obtain the VHDL models. 3. The resulting VHDL models would then be handled by the Xilinx tools to produce the bitstream programmed onto an FPGA or the Zynq device. In studying the second point, we encountered a number of problematic constructs that affect the Vivado HLS synthesis of the C designs generated by ORCC. These constructs relate to the way the C specification generated by ORCC is structured, which Vivado HLS cannot handle at certain stages of synthesis. We therefore propose a manual transformation of the ORCC-generated designs, affecting the original models as little as possible, so that synthesis with Vivado HLS succeeds and the correct VHDL file is produced. This document is structured along the lines of a research report. First, the motivations and objectives of the work are presented. Next, the state of the art of the elements needed for its development is analysed, providing the basic concepts for a correct understanding of the document; this includes a description of the RVC-CAL and VHDL languages and an introduction to the ORCC and Vivado tools, analysing the strengths and main features of both. Once the behaviour of both tools is understood, the solutions developed in our study of the synthesis of RVC-CAL models are described, highlighting the problematic points noted above that Vivado HLS cannot handle when synthesizing the C designs generated by the ORCC compiler. The solutions proposed for these synthesis errors are then presented, with which we aim to reach a C specification better suited to correct synthesis in Vivado HLS and thus obtain the appropriate VHDL models. Finally, a set of conclusions is drawn from the analyses and developments carried out, and a series of future lines of work is proposed with which the study could be continued and the research developed in this document completed.

Relevance: 60.00%

Abstract:

The problems of analyzing natural-language objects (NLOs) are considered from the point of view of their representation and processing in computer memory. A formalization of the NLO analysis task is proposed, and an example of a formalized representation of an NLO from a subject domain is given.

Relevance: 60.00%

Abstract:

One approach to natural-language text analysis is described, which uses an explanatory dictionary of the natural language, a local dictionary of the text being analyzed, and the frequency characteristics of words in that text.