Biblioteca Digital

41 resultados para ecosystem-based adaptation

em Universidad Politécnica de Madrid

Dynamic topic-based adaptation of language models: a comparison between different approaches

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a dynamic LM adaptation based on the topic that has been identified on a speech segment. We use LSA and the given topic labels in the training dataset to obtain and use the topic models. We propose a dynamic language model adaptation to improve the recognition performance in "a two stages" AST system. The final stage makes use of the topic identification with two variants: the first on uses just the most probable topic and the other one depends on the relative distances of the topics that have been identified. We perform the adaptation of the LM as a linear interpolation between a background model and topic-based LM. The interpolation weight id dynamically adapted according to different parameters. The proposed method is evaluated on the Spanish partition of the EPPS speech database. We achieved a relative reduction in WER of 11.13% over the baseline system which uses a single blackground LM.

Contributions to Speech Analytics based on Speech Recognition and Topic Identification

Relevância:

90.00% 90.00%

Publicador:

Resumo:

La última década ha sido testigo de importantes avances en el campo de la tecnología de reconocimiento de voz. Los sistemas comerciales existentes actualmente poseen la capacidad de reconocer habla continua de múltiples locutores, consiguiendo valores aceptables de error, y sin la necesidad de realizar procedimientos explícitos de adaptación. A pesar del buen momento que vive esta tecnología, el reconocimiento de voz dista de ser un problema resuelto. La mayoría de estos sistemas de reconocimiento se ajustan a dominios particulares y su eficacia depende de manera significativa, entre otros muchos aspectos, de la similitud que exista entre el modelo de lenguaje utilizado y la tarea específica para la cual se está empleando. Esta dependencia cobra aún más importancia en aquellos escenarios en los cuales las propiedades estadísticas del lenguaje varían a lo largo del tiempo, como por ejemplo, en dominios de aplicación que involucren habla espontánea y múltiples temáticas. En los últimos años se ha evidenciado un constante esfuerzo por mejorar los sistemas de reconocimiento para tales dominios. Esto se ha hecho, entre otros muchos enfoques, a través de técnicas automáticas de adaptación. Estas técnicas son aplicadas a sistemas ya existentes, dado que exportar el sistema a una nueva tarea o dominio puede requerir tiempo a la vez que resultar costoso. Las técnicas de adaptación requieren fuentes adicionales de información, y en este sentido, el lenguaje hablado puede aportar algunas de ellas. El habla no sólo transmite un mensaje, también transmite información acerca del contexto en el cual se desarrolla la comunicación hablada (e.g. acerca del tema sobre el cual se está hablando). Por tanto, cuando nos comunicamos a través del habla, es posible identificar los elementos del lenguaje que caracterizan el contexto, y al mismo tiempo, rastrear los cambios que ocurren en estos elementos a lo largo del tiempo. Esta información podría ser capturada y aprovechada por medio de técnicas de recuperación de información (information retrieval) y de aprendizaje de máquina (machine learning). Esto podría permitirnos, dentro del desarrollo de mejores sistemas automáticos de reconocimiento de voz, mejorar la adaptación de modelos del lenguaje a las condiciones del contexto, y por tanto, robustecer al sistema de reconocimiento en dominios con condiciones variables (tales como variaciones potenciales en el vocabulario, el estilo y la temática). En este sentido, la principal contribución de esta Tesis es la propuesta y evaluación de un marco de contextualización motivado por el análisis temático y basado en la adaptación dinámica y no supervisada de modelos de lenguaje para el robustecimiento de un sistema automático de reconocimiento de voz. Esta adaptación toma como base distintos enfoque de los sistemas mencionados (de recuperación de información y aprendizaje de máquina) mediante los cuales buscamos identificar las temáticas sobre las cuales se está hablando en una grabación de audio. Dicha identificación, por lo tanto, permite realizar una adaptación del modelo de lenguaje de acuerdo a las condiciones del contexto. El marco de contextualización propuesto se puede dividir en dos sistemas principales: un sistema de identificación de temática y un sistema de adaptación dinámica de modelos de lenguaje. Esta Tesis puede describirse en detalle desde la perspectiva de las contribuciones particulares realizadas en cada uno de los campos que componen el marco propuesto: _ En lo referente al sistema de identificación de temática, nos hemos enfocado en aportar mejoras a las técnicas de pre-procesamiento de documentos, asimismo en contribuir a la definición de criterios más robustos para la selección de index-terms. – La eficiencia de los sistemas basados tanto en técnicas de recuperación de información como en técnicas de aprendizaje de máquina, y específicamente de aquellos sistemas que particularizan en la tarea de identificación de temática, depende, en gran medida, de los mecanismos de preprocesamiento que se aplican a los documentos. Entre las múltiples operaciones que hacen parte de un esquema de preprocesamiento, la selección adecuada de los términos de indexado (index-terms) es crucial para establecer relaciones semánticas y conceptuales entre los términos y los documentos. Este proceso también puede verse afectado, o bien por una mala elección de stopwords, o bien por la falta de precisión en la definición de reglas de lematización. En este sentido, en este trabajo comparamos y evaluamos diferentes criterios para el preprocesamiento de los documentos, así como también distintas estrategias para la selección de los index-terms. Esto nos permite no sólo reducir el tamaño de la estructura de indexación, sino también mejorar el proceso de identificación de temática. – Uno de los aspectos más importantes en cuanto al rendimiento de los sistemas de identificación de temática es la asignación de diferentes pesos a los términos de acuerdo a su contribución al contenido del documento. En este trabajo evaluamos y proponemos enfoques alternativos a los esquemas tradicionales de ponderado de términos (tales como tf-idf ) que nos permitan mejorar la especificidad de los términos, así como también discriminar mejor las temáticas de los documentos. _ Respecto a la adaptación dinámica de modelos de lenguaje, hemos dividimos el proceso de contextualización en varios pasos. – Para la generación de modelos de lenguaje basados en temática, proponemos dos tipos de enfoques: un enfoque supervisado y un enfoque no supervisado. En el primero de ellos nos basamos en las etiquetas de temática que originalmente acompañan a los documentos del corpus que empleamos. A partir de estas, agrupamos los documentos que forman parte de la misma temática y generamos modelos de lenguaje a partir de dichos grupos. Sin embargo, uno de los objetivos que se persigue en esta Tesis es evaluar si el uso de estas etiquetas para la generación de modelos es óptimo en términos del rendimiento del reconocedor. Por esta razón, nosotros proponemos un segundo enfoque, un enfoque no supervisado, en el cual el objetivo es agrupar, automáticamente, los documentos en clusters temáticos, basándonos en la similaridad semántica existente entre los documentos. Por medio de enfoques de agrupamiento conseguimos mejorar la cohesión conceptual y semántica en cada uno de los clusters, lo que a su vez nos permitió refinar los modelos de lenguaje basados en temática y mejorar el rendimiento del sistema de reconocimiento. – Desarrollamos diversas estrategias para generar un modelo de lenguaje dependiente del contexto. Nuestro objetivo es que este modelo refleje el contexto semántico del habla, i.e. las temáticas más relevantes que se están discutiendo. Este modelo es generado por medio de la interpolación lineal entre aquellos modelos de lenguaje basados en temática que estén relacionados con las temáticas más relevantes. La estimación de los pesos de interpolación está basada principalmente en el resultado del proceso de identificación de temática. – Finalmente, proponemos una metodología para la adaptación dinámica de un modelo de lenguaje general. El proceso de adaptación tiene en cuenta no sólo al modelo dependiente del contexto sino también a la información entregada por el proceso de identificación de temática. El esquema usado para la adaptación es una interpolación lineal entre el modelo general y el modelo dependiente de contexto. Estudiamos también diferentes enfoques para determinar los pesos de interpolación entre ambos modelos. Una vez definida la base teórica de nuestro marco de contextualización, proponemos su aplicación dentro de un sistema automático de reconocimiento de voz. Para esto, nos enfocamos en dos aspectos: la contextualización de los modelos de lenguaje empleados por el sistema y la incorporación de información semántica en el proceso de adaptación basado en temática. En esta Tesis proponemos un marco experimental basado en una arquitectura de reconocimiento en ‘dos etapas’. En la primera etapa, empleamos sistemas basados en técnicas de recuperación de información y aprendizaje de máquina para identificar las temáticas sobre las cuales se habla en una transcripción de un segmento de audio. Esta transcripción es generada por el sistema de reconocimiento empleando un modelo de lenguaje general. De acuerdo con la relevancia de las temáticas que han sido identificadas, se lleva a cabo la adaptación dinámica del modelo de lenguaje. En la segunda etapa de la arquitectura de reconocimiento, usamos este modelo adaptado para realizar de nuevo el reconocimiento del segmento de audio. Para determinar los beneficios del marco de trabajo propuesto, llevamos a cabo la evaluación de cada uno de los sistemas principales previamente mencionados. Esta evaluación es realizada sobre discursos en el dominio de la política usando la base de datos EPPS (European Parliamentary Plenary Sessions - Sesiones Plenarias del Parlamento Europeo) del proyecto europeo TC-STAR. Analizamos distintas métricas acerca del rendimiento de los sistemas y evaluamos las mejoras propuestas con respecto a los sistemas de referencia. ABSTRACT The last decade has witnessed major advances in speech recognition technology. Today’s commercial systems are able to recognize continuous speech from numerous speakers, with acceptable levels of error and without the need for an explicit adaptation procedure. Despite this progress, speech recognition is far from being a solved problem. Most of these systems are adjusted to a particular domain and their efficacy depends significantly, among many other aspects, on the similarity between the language model used and the task that is being addressed. This dependence is even more important in scenarios where the statistical properties of the language fluctuates throughout the time, for example, in application domains involving spontaneous and multitopic speech. Over the last years there has been an increasing effort in enhancing the speech recognition systems for such domains. This has been done, among other approaches, by means of techniques of automatic adaptation. These techniques are applied to the existing systems, specially since exporting the system to a new task or domain may be both time-consuming and expensive. Adaptation techniques require additional sources of information, and the spoken language could provide some of them. It must be considered that speech not only conveys a message, it also provides information on the context in which the spoken communication takes place (e.g. on the subject on which it is being talked about). Therefore, when we communicate through speech, it could be feasible to identify the elements of the language that characterize the context, and at the same time, to track the changes that occur in those elements over time. This information can be extracted and exploited through techniques of information retrieval and machine learning. This allows us, within the development of more robust speech recognition systems, to enhance the adaptation of language models to the conditions of the context, thus strengthening the recognition system for domains under changing conditions (such as potential variations in vocabulary, style and topic). In this sense, the main contribution of this Thesis is the proposal and evaluation of a framework of topic-motivated contextualization based on the dynamic and non-supervised adaptation of language models for the enhancement of an automatic speech recognition system. This adaptation is based on an combined approach (from the perspective of both information retrieval and machine learning fields) whereby we identify the topics that are being discussed in an audio recording. The topic identification, therefore, enables the system to perform an adaptation of the language model according to the contextual conditions. The proposed framework can be divided in two major systems: a topic identification system and a dynamic language model adaptation system. This Thesis can be outlined from the perspective of the particular contributions made in each of the fields that composes the proposed framework: _ Regarding the topic identification system, we have focused on the enhancement of the document preprocessing techniques in addition to contributing in the definition of more robust criteria for the selection of index-terms. – Within both information retrieval and machine learning based approaches, the efficiency of topic identification systems, depends, to a large extent, on the mechanisms of preprocessing applied to the documents. Among the many operations that encloses the preprocessing procedures, an adequate selection of index-terms is critical to establish conceptual and semantic relationships between terms and documents. This process might also be weakened by a poor choice of stopwords or lack of precision in defining stemming rules. In this regard we compare and evaluate different criteria for preprocessing the documents, as well as for improving the selection of the index-terms. This allows us to not only reduce the size of the indexing structure but also to strengthen the topic identification process. – One of the most crucial aspects, in relation to the performance of topic identification systems, is to assign different weights to different terms depending on their contribution to the content of the document. In this sense we evaluate and propose alternative approaches to traditional weighting schemes (such as tf-idf ) that allow us to improve the specificity of terms, and to better identify the topics that are related to documents. _ Regarding the dynamic language model adaptation, we divide the contextualization process into different steps. – We propose supervised and unsupervised approaches for the generation of topic-based language models. The first of them is intended to generate topic-based language models by grouping the documents, in the training set, according to the original topic labels of the corpus. Nevertheless, a goal of this Thesis is to evaluate whether or not the use of these labels to generate language models is optimal in terms of recognition accuracy. For this reason, we propose a second approach, an unsupervised one, in which the objective is to group the data in the training set into automatic topic clusters based on the semantic similarity between the documents. By means of clustering approaches we expect to obtain a more cohesive association of the documents that are related by similar concepts, thus improving the coverage of the topic-based language models and enhancing the performance of the recognition system. – We develop various strategies in order to create a context-dependent language model. Our aim is that this model reflects the semantic context of the current utterance, i.e. the most relevant topics that are being discussed. This model is generated by means of a linear interpolation between the topic-based language models related to the most relevant topics. The estimation of the interpolation weights is based mainly on the outcome of the topic identification process. – Finally, we propose a methodology for the dynamic adaptation of a background language model. The adaptation process takes into account the context-dependent model as well as the information provided by the topic identification process. The scheme used for the adaptation is a linear interpolation between the background model and the context-dependent one. We also study different approaches to determine the interpolation weights used in this adaptation scheme. Once we defined the basis of our topic-motivated contextualization framework, we propose its application into an automatic speech recognition system. We focus on two aspects: the contextualization of the language models used by the system, and the incorporation of semantic-related information into a topic-based adaptation process. To achieve this, we propose an experimental framework based in ‘a two stages’ recognition architecture. In the first stage of the architecture, Information Retrieval and Machine Learning techniques are used to identify the topics in a transcription of an audio segment. This transcription is generated by the recognition system using a background language model. According to the confidence on the topics that have been identified, the dynamic language model adaptation is carried out. In the second stage of the recognition architecture, an adapted language model is used to re-decode the utterance. To test the benefits of the proposed framework, we carry out the evaluation of each of the major systems aforementioned. The evaluation is conducted on speeches of political domain using the EPPS (European Parliamentary Plenary Sessions) database from the European TC-STAR project. We analyse several performance metrics that allow us to compare the improvements of the proposed systems against the baseline ones.

Numerical Error Prediction and its applications in CFD using tau-estimation

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Nowadays, Computational Fluid Dynamics (CFD) solvers are widely used within the industry to model fluid flow phenomenons. Several fluid flow model equations have been employed in the last decades to simulate and predict forces acting, for example, on different aircraft configurations. Computational time and accuracy are strongly dependent on the fluid flow model equation and the spatial dimension of the problem considered. While simple models based on perfect flows, like panel methods or potential flow models can be very fast to solve, they usually suffer from a poor accuracy in order to simulate real flows (transonic, viscous). On the other hand, more complex models such as the full Navier- Stokes equations provide high fidelity predictions but at a much higher computational cost. Thus, a good compromise between accuracy and computational time has to be fixed for engineering applications. A discretisation technique widely used within the industry is the so-called Finite Volume approach on unstructured meshes. This technique spatially discretises the flow motion equations onto a set of elements which form a mesh, a discrete representation of the continuous domain. Using this approach, for a given flow model equation, the accuracy and computational time mainly depend on the distribution of nodes forming the mesh. Therefore, a good compromise between accuracy and computational time might be obtained by carefully defining the mesh. However, defining an optimal mesh for complex flows and geometries requires a very high level expertize in fluid mechanics and numerical analysis, and in most cases a simple guess of regions of the computational domain which might affect the most the accuracy is impossible. Thus, it is desirable to have an automatized remeshing tool, which is more flexible with unstructured meshes than its structured counterpart. However, adaptive methods currently in use still have an opened question: how to efficiently drive the adaptation ? Pioneering sensors based on flow features generally suffer from a lack of reliability, so in the last decade more effort has been made in developing numerical error-based sensors, like for instance the adjoint-based adaptation sensors. While very efficient at adapting meshes for a given functional output, the latter method is very expensive as it requires to solve a dual set of equations and computes the sensor on an embedded mesh. Therefore, it would be desirable to develop a more affordable numerical error estimation method. The current work aims at estimating the truncation error, which arises when discretising a partial differential equation. These are the higher order terms neglected in the construction of the numerical scheme. The truncation error provides very useful information as it is strongly related to the flow model equation and its discretisation. On one hand, it is a very reliable measure of the quality of the mesh, therefore very useful in order to drive a mesh adaptation procedure. On the other hand, it is strongly linked to the flow model equation, so that a careful estimation actually gives information on how well a given equation is solved, which may be useful in the context of _ -extrapolation or zonal modelling. The following work is organized as follows: Chap. 1 contains a short review of mesh adaptation techniques as well as numerical error prediction. In the first section, Sec. 1.1, the basic refinement strategies are reviewed and the main contribution to structured and unstructured mesh adaptation are presented. Sec. 1.2 introduces the definitions of errors encountered when solving Computational Fluid Dynamics problems and reviews the most common approaches to predict them. Chap. 2 is devoted to the mathematical formulation of truncation error estimation in the context of finite volume methodology, as well as a complete verification procedure. Several features are studied, such as the influence of grid non-uniformities, non-linearity, boundary conditions and non-converged numerical solutions. This verification part has been submitted and accepted for publication in the Journal of Computational Physics. Chap. 3 presents a mesh adaptation algorithm based on truncation error estimates and compares the results to a feature-based and an adjoint-based sensor (in collaboration with Jorge Ponsín, INTA). Two- and three-dimensional cases relevant for validation in the aeronautical industry are considered. This part has been submitted and accepted in the AIAA Journal. An extension to Reynolds Averaged Navier- Stokes equations is also included, where _ -estimation-based mesh adaptation and _ -extrapolation are applied to viscous wing profiles. The latter has been submitted in the Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering. Keywords: mesh adaptation, numerical error prediction, finite volume Hoy en día, la Dinámica de Fluidos Computacional (CFD) es ampliamente utilizada dentro de la industria para obtener información sobre fenómenos fluidos. La Dinámica de Fluidos Computacional considera distintas modelizaciones de las ecuaciones fluidas (Potencial, Euler, Navier-Stokes, etc) para simular y predecir las fuerzas que actúan, por ejemplo, sobre una configuración de aeronave. El tiempo de cálculo y la precisión en la solución depende en gran medida de los modelos utilizados, así como de la dimensión espacial del problema considerado. Mientras que modelos simples basados en flujos perfectos, como modelos de flujos potenciales, se pueden resolver rápidamente, por lo general aducen de una baja precisión a la hora de simular flujos reales (viscosos, transónicos, etc). Por otro lado, modelos más complejos tales como el conjunto de ecuaciones de Navier-Stokes proporcionan predicciones de alta fidelidad, a expensas de un coste computacional mucho más elevado. Por lo tanto, en términos de aplicaciones de ingeniería se debe fijar un buen compromiso entre precisión y tiempo de cálculo. Una técnica de discretización ampliamente utilizada en la industria es el método de los Volúmenes Finitos en mallas no estructuradas. Esta técnica discretiza espacialmente las ecuaciones del movimiento del flujo sobre un conjunto de elementos que forman una malla, una representación discreta del dominio continuo. Utilizando este enfoque, para una ecuación de flujo dado, la precisión y el tiempo computacional dependen principalmente de la distribución de los nodos que forman la malla. Por consiguiente, un buen compromiso entre precisión y tiempo de cálculo se podría obtener definiendo cuidadosamente la malla, concentrando sus elementos en aquellas zonas donde sea estrictamente necesario. Sin embargo, la definición de una malla óptima para corrientes y geometrías complejas requiere un nivel muy alto de experiencia en la mecánica de fluidos y el análisis numérico, así como un conocimiento previo de la solución. Aspecto que en la mayoría de los casos no está disponible. Por tanto, es deseable tener una herramienta que permita adaptar los elementos de malla de forma automática, acorde a la solución fluida (remallado). Esta herramienta es generalmente más flexible en mallas no estructuradas que con su homóloga estructurada. No obstante, los métodos de adaptación actualmente en uso todavía dejan una pregunta abierta: cómo conducir de manera eficiente la adaptación. Sensores pioneros basados en las características del flujo en general, adolecen de una falta de fiabilidad, por lo que en la última década se han realizado grandes esfuerzos en el desarrollo numérico de sensores basados en el error, como por ejemplo los sensores basados en el adjunto. A pesar de ser muy eficientes en la adaptación de mallas para un determinado funcional, este último método resulta muy costoso, pues requiere resolver un doble conjunto de ecuaciones: la solución y su adjunta. Por tanto, es deseable desarrollar un método numérico de estimación de error más asequible. El presente trabajo tiene como objetivo estimar el error local de truncación, que aparece cuando se discretiza una ecuación en derivadas parciales. Estos son los términos de orden superior olvidados en la construcción del esquema numérico. El error de truncación proporciona una información muy útil sobre la solución: es una medida muy fiable de la calidad de la malla, obteniendo información que permite llevar a cabo un procedimiento de adaptación de malla. Está fuertemente relacionado al modelo matemático fluido, de modo que una estimación precisa garantiza la idoneidad de dicho modelo en un campo fluido, lo que puede ser útil en el contexto de modelado zonal. Por último, permite mejorar la precisión de la solución resolviendo un nuevo sistema donde el error local actúa como término fuente (_ -extrapolación). El presenta trabajo se organiza de la siguiente manera: Cap. 1 contiene una breve reseña de las técnicas de adaptación de malla, así como de los métodos de predicción de los errores numéricos. En la primera sección, Sec. 1.1, se examinan las estrategias básicas de refinamiento y se presenta la principal contribución a la adaptación de malla estructurada y no estructurada. Sec 1.2 introduce las definiciones de los errores encontrados en la resolución de problemas de Dinámica Computacional de Fluidos y se examinan los enfoques más comunes para predecirlos. Cap. 2 está dedicado a la formulación matemática de la estimación del error de truncación en el contexto de la metodología de Volúmenes Finitos, así como a un procedimiento de verificación completo. Se estudian varias características que influyen en su estimación: la influencia de la falta de uniformidad de la malla, el efecto de las no linealidades del modelo matemático, diferentes condiciones de contorno y soluciones numéricas no convergidas. Esta parte de verificación ha sido presentada y aceptada para su publicación en el Journal of Computational Physics. Cap. 3 presenta un algoritmo de adaptación de malla basado en la estimación del error de truncación y compara los resultados con sensores de featured-based y adjointbased (en colaboración con Jorge Ponsín del INTA). Se consideran casos en dos y tres dimensiones, relevantes para la validación en la industria aeronáutica. Este trabajo ha sido presentado y aceptado en el AIAA Journal. También se incluye una extensión de estos métodos a las ecuaciones RANS (Reynolds Average Navier- Stokes), en donde adaptación de malla basada en _ y _ -extrapolación son aplicados a perfiles con viscosidad de alas. Este último trabajo se ha presentado en los Actas de la Institución de Ingenieros Mecánicos, Parte G: Journal of Aerospace Engineering. Palabras clave: adaptación de malla, predicción del error numérico, volúmenes finitos

Automated end user-centred adaptation of web components through automated description logic-based reasoning

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Context: This paper addresses one of the major end-user development (EUD) challenges, namely, how to pack today?s EUD support tools with composable elements. This would give end users better access to more components which they can use to build a solution tailored to their own needs. The success of later end-user software engineering (EUSE) activities largely depends on how many components each tool has and how adaptable components are to multiple problem domains. Objective: A system for automatically adapting heterogeneous components to a common development environment would offer a sizeable saving of time and resources within the EUD support tool construction process. This paper presents an automated adaptation system for transforming EUD components to a standard format. Method: This system is based on the use of description logic. Based on a generic UML2 data model, this description logic is able to check whether an end-user component can be transformed to this modeling language through subsumption or as an instance of the UML2 model. Besides it automatically finds a consistent, non-ambiguous and finite set of XSLT mappings to automatically prepare data in order to leverage the component as part of a tool that conforms to the target UML2 component model. Results: The proposed system has been successfully applied to components from four prominent EUD tools. These components were automatically converted to a standard format. In order to validate the proposed system, rich internet applications (RIA) used as an operational support system for operators at a large services company were developed using automatically adapted standard format components. These RIAs would be impossible to develop using each EUD tool separately. Conclusion: The positive results of applying our system for automatically adapting components from current tool catalogues are indicative of the system?s effectiveness. Use of this system could foster the growth of web EUD component catalogues, leveraging a vast ecosystem of user-centred SaaS to further current EUSE trends.

Mutual Information and Perplexity based Clustering of Dialogue Information for Dynamic Adaptation of Language Models

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present two approaches to cluster dialogue-based information obtained by the speech understanding module and the dialogue manager of a spoken dialogue system. The purpose is to estimate a language model related to each cluster, and use them to dynamically modify the model of the speech recognizer at each dialogue turn. In the first approach we build the cluster tree using local decisions based on a Maximum Normalized Mutual Information criterion. In the second one we take global decisions, based on the optimization of the global perplexity of the combination of the cluster-related LMs. Our experiments show a relative reduction of the word error rate of 15.17%, which helps to improve the performance of the understanding and the dialogue manager modules.

Mutual Information and Perplexity based Clustering of Dialogue Information for Dynamic Adaptation of Language Models

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present two approaches to cluster dialogue-based information obtained by the speech understanding module and the dialogue manager of a spoken dialogue system. The purpose is to estimate a language model related to each cluster, and use them to dynamically modify the model of the speech recognizer at each dialogue turn. In the first approach we build the cluster tree using local decisions based on a Maximum Normalized Mutual Information criterion. In the second one we take global decisions, based on the optimization of the global perplexity of the combination of the cluster-related LMs. Our experiments show a relative reduction of the word error rate of 15.17%, which helps to improve the performance of the understanding and the dialogue manager modules.

On the dynamic adaptation of language models based on dialogue information

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present an approach to adapt dynamically the language models (LMs) used by a speech recognizer that is part of a spoken dialogue system. We have developed a grammar generation strategy that automatically adapts the LMs using the semantic information that the user provides (represented as dialogue concepts), together with the information regarding the intentions of the speaker (inferred by the dialogue manager, and represented as dialogue goals). We carry out the adaptation as a linear interpolation between a background LM, and one or more of the LMs associated to the dialogue elements (concepts or goals) addressed by the user. The interpolation weights between those models are automatically estimated on each dialogue turn, using measures such as the posterior probabilities of concepts and goals, estimated as part of the inference procedure to determine the actions to be carried out. We propose two approaches to handle the LMs related to concepts and goals. Whereas in the first one we estimate a LM for each one of them, in the second one we apply several clustering strategies to group together those elements that share some common properties, and estimate a LM for each cluster. Our evaluation shows how the system can estimate a dynamic model adapted to each dialogue turn, which helps to improve the performance of the speech recognition (up to a 14.82% of relative improvement), which leads to an improvement in both the language understanding and the dialogue management tasks.

A sharing-based approach to supporting adaptation in service compositions

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Data-related properties of the activities involved in a service composition can be used to facilitate several design-time and run-time adaptation tasks, such as service evolution, distributed enactment, and instance-level adaptation. A number of these properties can be expressed using a notion of sharing. We present an approach for automated inference of data properties based on sharing analysis, which is able to handle service compositions with complex control structures, involving loops and sub-workflows. The properties inferred can include data dependencies, information content, domain-defined attributes, privacy or confidentiality levels, among others. The analysis produces characterizations of the data and the activities in the composition in terms of minimal and maximal sharing, which can then be used to verify compliance of potential adaptation actions, or as supporting information in their generation. This sharing analysis approach can be used both at design time and at run time. In the latter case, the results of analysis can be refined using the composition traces (execution logs) at the point of execution, in order to support run-time adaptation.

Adaptation strategies for high order discontinuous Galerkin methods based on Tau-estimation

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper three p-adaptation strategies based on the minimization of the truncation error are presented for high order discontinuous Galerkin methods. The truncation error is approximated by means of a ? -estimation procedure and enables the identification of mesh regions that require adaptation. Three adaptation strategies are developed and termed a posteriori, quasi-a priori and quasi-a priori corrected. All strategies require fine solutions, which are obtained by enriching the polynomial order, but while the former needs time converged solutions, the last two rely on non-converged solutions, which lead to faster computations. In addition, the high order method permits the spatial decoupling for the estimated errors and enables anisotropic p-adaptation. These strategies are verified and compared in terms of accuracy and computational cost for the Euler and the compressible Navier?Stokes equations. It is shown that the two quasi- a priori methods achieve a significant reduction in computational cost when compared to a uniform polynomial enrichment. Namely, for a viscous boundary layer flow, we obtain a speedup of 6.6 and 7.6 for the quasi-a priori and quasi-a priori corrected approaches, respectively.

Emotion transplantation through adaptation in HMM-based speech synthesis

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper proposes an emotion transplantation method capable of modifying a synthetic speech model through the use of CSMAPLR adaptation in order to incorporate emotional information learned from a different speaker model while maintaining the identity of the original speaker as much as possible. The proposed method relies on learning both emotional and speaker identity information by means of their adaptation function from an average voice model, and combining them into a single cascade transform capable of imbuing the desired emotion into the target speaker. This method is then applied to the task of transplanting four emotions (anger, happiness, sadness and surprise) into 3 male speakers and 3 female speakers and evaluated in a number of perceptual tests. The results of the evaluations show how the perceived naturalness for emotional text significantly favors the use of the proposed transplanted emotional speech synthesis when compared to traditional neutral speech synthesis, evidenced by a big increase in the perceived emotional strength of the synthesized utterances at a slight cost in speech quality. A final evaluation with a robotic laboratory assistant application shows how by using emotional speech we can significantly increase the students’ satisfaction with the dialog system, proving how the proposed emotion transplantation system provides benefits in real applications.

Vision-based vehicle detection and tracking with a mobile camera using a statistical framework

Relevância:

30.00% 30.00%

Publicador:

Resumo:

En esta tesis se aborda la detección y el seguimiento automático de vehículos mediante técnicas de visión artificial con una cámara monocular embarcada. Este problema ha suscitado un gran interés por parte de la industria automovilística y de la comunidad científica ya que supone el primer paso en aras de la ayuda a la conducción, la prevención de accidentes y, en última instancia, la conducción automática. A pesar de que se le ha dedicado mucho esfuerzo en los últimos años, de momento no se ha encontrado ninguna solución completamente satisfactoria y por lo tanto continúa siendo un tema de investigación abierto. Los principales problemas que plantean la detección y seguimiento mediante visión artificial son la gran variabilidad entre vehículos, un fondo que cambia dinámicamente debido al movimiento de la cámara, y la necesidad de operar en tiempo real. En este contexto, esta tesis propone un marco unificado para la detección y seguimiento de vehículos que afronta los problemas descritos mediante un enfoque estadístico. El marco se compone de tres grandes bloques, i.e., generación de hipótesis, verificación de hipótesis, y seguimiento de vehículos, que se llevan a cabo de manera secuencial. No obstante, se potencia el intercambio de información entre los diferentes bloques con objeto de obtener el máximo grado posible de adaptación a cambios en el entorno y de reducir el coste computacional. Para abordar la primera tarea de generación de hipótesis, se proponen dos métodos complementarios basados respectivamente en el análisis de la apariencia y la geometría de la escena. Para ello resulta especialmente interesante el uso de un dominio transformado en el que se elimina la perspectiva de la imagen original, puesto que este dominio permite una búsqueda rápida dentro de la imagen y por tanto una generación eficiente de hipótesis de localización de los vehículos. Los candidatos finales se obtienen por medio de un marco colaborativo entre el dominio original y el dominio transformado. Para la verificación de hipótesis se adopta un método de aprendizaje supervisado. Así, se evalúan algunos de los métodos de extracción de características más populares y se proponen nuevos descriptores con arreglo al conocimiento de la apariencia de los vehículos. Para evaluar la efectividad en la tarea de clasificación de estos descriptores, y dado que no existen bases de datos públicas que se adapten al problema descrito, se ha generado una nueva base de datos sobre la que se han realizado pruebas masivas. Finalmente, se presenta una metodología para la fusión de los diferentes clasificadores y se plantea una discusión sobre las combinaciones que ofrecen los mejores resultados. El núcleo del marco propuesto está constituido por un método Bayesiano de seguimiento basado en filtros de partículas. Se plantean contribuciones en los tres elementos fundamentales de estos filtros: el algoritmo de inferencia, el modelo dinámico y el modelo de observación. En concreto, se propone el uso de un método de muestreo basado en MCMC que evita el elevado coste computacional de los filtros de partículas tradicionales y por consiguiente permite que el modelado conjunto de múltiples vehículos sea computacionalmente viable. Por otra parte, el dominio transformado mencionado anteriormente permite la definición de un modelo dinámico de velocidad constante ya que se preserva el movimiento suave de los vehículos en autopistas. Por último, se propone un modelo de observación que integra diferentes características. En particular, además de la apariencia de los vehículos, el modelo tiene en cuenta también toda la información recibida de los bloques de procesamiento previos. El método propuesto se ejecuta en tiempo real en un ordenador de propósito general y da unos resultados sobresalientes en comparación con los métodos tradicionales. ABSTRACT This thesis addresses on-road vehicle detection and tracking with a monocular vision system. This problem has attracted the attention of the automotive industry and the research community as it is the first step for driver assistance and collision avoidance systems and for eventual autonomous driving. Although many effort has been devoted to address it in recent years, no satisfactory solution has yet been devised and thus it is an active research issue. The main challenges for vision-based vehicle detection and tracking are the high variability among vehicles, the dynamically changing background due to camera motion and the real-time processing requirement. In this thesis, a unified approach using statistical methods is presented for vehicle detection and tracking that tackles these issues. The approach is divided into three primary tasks, i.e., vehicle hypothesis generation, hypothesis verification, and vehicle tracking, which are performed sequentially. Nevertheless, the exchange of information between processing blocks is fostered so that the maximum degree of adaptation to changes in the environment can be achieved and the computational cost is alleviated. Two complementary strategies are proposed to address the first task, i.e., hypothesis generation, based respectively on appearance and geometry analysis. To this end, the use of a rectified domain in which the perspective is removed from the original image is especially interesting, as it allows for fast image scanning and coarse hypothesis generation. The final vehicle candidates are produced using a collaborative framework between the original and the rectified domains. A supervised classification strategy is adopted for the verification of the hypothesized vehicle locations. In particular, state-of-the-art methods for feature extraction are evaluated and new descriptors are proposed by exploiting the knowledge on vehicle appearance. Due to the lack of appropriate public databases, a new database is generated and the classification performance of the descriptors is extensively tested on it. Finally, a methodology for the fusion of the different classifiers is presented and the best combinations are discussed. The core of the proposed approach is a Bayesian tracking framework using particle filters. Contributions are made on its three key elements: the inference algorithm, the dynamic model and the observation model. In particular, the use of a Markov chain Monte Carlo method is proposed for sampling, which circumvents the exponential complexity increase of traditional particle filters thus making joint multiple vehicle tracking affordable. On the other hand, the aforementioned rectified domain allows for the definition of a constant-velocity dynamic model since it preserves the smooth motion of vehicles in highways. Finally, a multiple-cue observation model is proposed that not only accounts for vehicle appearance but also integrates the available information from the analysis in the previous blocks. The proposed approach is proven to run near real-time in a general purpose PC and to deliver outstanding results compared to traditional methods.

Automatic Adaptation of Airport Surface Surveillance to Sensor Quality

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper describes a novel method to enhance current airport surveillance systems used in Advanced Surveillance Monitoring Guidance and Control Systems (A-SMGCS). The proposed method allows for the automatic calibration of measurement models and enhanced detection of nonideal situations, increasing surveillance products integrity. It is based on the definition of a set of observables from the surveillance processing chain and a rule based expert system aimed to change the data processing methods

Explicitly Context-Aware Publish/Subscribe with Context-Invariant Subscriptions

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Although context could be exploited to improve the performance, elasticity and adaptation in most distributed systems that adopt the publish/subscribe (P/S) model of communication, only very few works have explored domains with highly dynamic context, whereas most adopted models are context agnostic. In this paper, we present the key design principles underlying a novel context-aware content-based P/S (CA-CBPS) model of communication, where the context is explicitly managed, focusing on the minimization of network overhead in domains with recurrent context changes thanks to contextual scoping. We highlight how we dealt with the main shortcomings of most of the current approaches. Our research is some of the first to study the problem of explicitly introducing context-awareness into the P/S model to capitalize on contextual information. The envisioned CA-CBPS middleware enables the cloud ecosystem of services to communicate very efficiently, in a decoupled, but contextually scoped fashion.

Analyzing training dependencies and posterior fusion in discriminant classification of apnoea patients based on sustained and connected speech

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a novel approach using both sustained vowels and connected speech, to detect obstructive sleep apnea (OSA) cases within a homogeneous group of speakers. The proposed scheme is based on state-of-the-art GMM-based classifiers, and acknowledges specifically the way in which acoustic models are trained on standard databases, as well as the complexity of the resulting models and their adaptation to specific data. Our experimental database contains a suitable number of utterances and sustained speech from healthy (i.e control) and OSA Spanish speakers. Finally, a 25.1% relative reduction in classification error is achieved when fusing continuous and sustained speech classifiers. Index Terms: obstructive sleep apnea (OSA), gaussian mixture models (GMMs), background model (BM), classifier fusion.

Bio-inspired FPGA Architecture for Self-Calibration of an Image Compression Core based on Wavelet Transforms in Embedded Systems

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A generic bio-inspired adaptive architecture for image compression suitable to be implemented in embedded systems is presented. The architecture allows the system to be tuned during its calibration phase. An evolutionary algorithm is responsible of making the system evolve towards the required performance. A prototype has been implemented in a Xilinx Virtex-5 FPGA featuring an adaptive wavelet transform core directed at improving image compression for specific types of images. An Evolution Strategy has been chosen as the search algorithm and its typical genetic operators adapted to allow for a hardware friendly implementation. HW/SW partitioning issues are also considered after a high level description of the algorithm is profiled which validates the proposed resource allocation in the device fabric. To check the robustness of the system and its adaptation capabilities, different types of images have been selected as validation patterns. A direct application of such a system is its deployment in an unknown environment during design time, letting the calibration phase adjust the system parameters so that it performs efcient image compression. Also, this prototype implementation may serve as an accelerator for the automatic design of evolved transform coefficients which are later on synthesized and implemented in a non-adaptive system in the final implementation device, whether it is a HW or SW based computing device. The architecture has been built in a modular way so that it can be easily extended to adapt other types of image processing cores. Details on this pluggable component point of view are also given in the paper.

«
1
2
3
»