28 results for Word error rate
in Universidad Politécnica de Madrid
Abstract:
We present two approaches to cluster dialogue-based information obtained by the speech understanding module and the dialogue manager of a spoken dialogue system. The purpose is to estimate a language model (LM) related to each cluster, and to use these models to dynamically modify the model of the speech recognizer at each dialogue turn. In the first approach we build the cluster tree using local decisions based on a Maximum Normalized Mutual Information criterion. In the second one we take global decisions, based on the optimization of the global perplexity of the combination of the cluster-related LMs. Our experiments show a relative word error rate reduction of 15.17%, which helps to improve the performance of the understanding and the dialogue manager modules.
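As a reminder of how the metric in the abstract above is usually defined, the following minimal sketch computes a word error rate as the word-level edit distance divided by the reference length, and a relative reduction between two error rates. The function names and the numbers in the usage note are illustrative assumptions; they are not taken from the paper, which does not state its absolute error rates here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far
    # (one fewer than in the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - improved) / baseline
```

With made-up figures, a baseline of 0.20 and an improved rate of 0.17 give `relative_reduction(0.20, 0.17)` of about 0.15, that is, a 15% relative reduction.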
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly its best-known applications are the tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem, and ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
This paper describes the application of language translation technologies for generating bus information in Spanish Sign Language (LSE: Lengua de Signos Española). In this work, two main systems have been developed: the first for translating text messages from information panels and the second for translating spoken Spanish into natural conversations at the information point of the bus company. Both systems are made up of a natural language translator (for converting a word sentence into a sequence of LSE signs), and a 3D avatar animation module (for playing back the signs). For the natural language translator, two technological approaches have been analyzed and integrated: an example-based strategy and a statistical translator. When translating spoken utterances, it is also necessary to incorporate a speech recognizer for decoding the spoken utterance into a word sequence, prior to the language translation module. This paper includes a detailed description of the field evaluation carried out in this domain. This evaluation has been carried out at the customer information office in Madrid involving both real bus company employees and deaf people. The evaluation includes objective measurements from the system and information from questionnaires. In the field evaluation, the whole translation presents an SER (Sign Error Rate) of less than 10% and a BLEU greater than 90%.
Abstract:
The time delay of arrival (TDOA) between multiple microphones has been used since 2006 as a source of information (localization) to complement the spectral features for speaker diarization. In this paper, we propose a new localization feature, the intensity channel contribution (ICC), based on the relative energy of the signal arriving at each channel compared to the sum of the energy of all the channels. We have demonstrated that by joining the ICC features and the TDOA features, the robustness of the localization features is improved and the diarization error rate (DER) of the complete system (using localization and spectral features) is reduced. By using this new localization feature, we have been able to achieve a 5.2% DER relative improvement in our development data, a 3.6% DER relative improvement in the RT07 evaluation data and a 7.9% DER relative improvement in last year's RT09 evaluation data.
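The ICC description above can be read literally as "energy of one channel divided by the summed energy of all channels". The sketch below implements only that literal reading for a single analysis frame. The function name, the frame layout and the handling of silent frames are assumptions, and the paper's actual feature extraction (windowing, smoothing, any normalisation) is not given in the abstract.

```python
from typing import List, Sequence


def intensity_channel_contribution(frame: Sequence[Sequence[float]]) -> List[float]:
    """Return one value per channel: its energy divided by the total energy.

    frame[c] holds the samples of microphone channel c for one analysis window.
    """
    energies = [sum(s * s for s in channel) for channel in frame]
    total = sum(energies)
    if total == 0.0:
        # Assumption: a completely silent frame is split evenly across channels.
        return [1.0 / len(energies)] * len(energies)
    return [e / total for e in energies]
```

For example, three channels with energies 2, 2 and 4 yield contributions 0.25, 0.25 and 0.5, which always sum to 1.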
Abstract:
Because the metro network market is very cost-sensitive, directly modulated schemes appear attractive. In this paper a CWDM (Coarse Wavelength Division Multiplexing) system is studied in detail by means of optical communication system design software; a detailed study of the modulated current shape (exponential, sine and Gaussian) for 2.5 Gb/s CWDM Metropolitan Area Networks is performed to evaluate its tolerance to linear impairments such as signal-to-noise-ratio degradation and dispersion. Point-to-point links are investigated and optimum design parameters are obtained. Through extensive sets of simulation results, it is shown that some of these pulse shapes are more tolerant to dispersion than conventional Gaussian pulses. In order to achieve a low Bit Error Rate (BER), different types of optical transmitters are considered, including strongly adiabatic and transient chirp dominated Directly Modulated Lasers (DMLs). We have used fibers with different dispersion characteristics, showing that the system performance depends strongly on the chosen DML-fiber pair.
Abstract:
This paper proposes the use of Factored Translation Models (FTMs) for improving a Speech into Sign Language Translation System. These FTMs allow syntactic-semantic information to be incorporated during the translation process. This new information significantly reduces the translation error rate. This paper also analyses different alternatives for dealing with non-relevant words. The speech into sign language translation system has been developed and evaluated in a specific application domain: the renewal of Identity Documents and Driver's Licenses. The translation system uses a phrase-based translation system (Moses). The evaluation results reveal that the BLEU (BiLingual Evaluation Understudy) has improved from 69.1% to 73.9% and the mSER (multiple references Sign Error Rate) has been reduced from 30.6% to 24.8%.
Abstract:
This article evaluates an authentication technique for mobiles based on gestures. Users create a memorable identifying gesture to be considered as their in-air signature. This work analyzes a database of 120 gestures of different vulnerability, obtaining an Equal Error Rate (EER) of 9.19% when the robustness of gestures is not verified. Most of the errors in this EER come from very simple and easily forgeable gestures that should be discarded at the enrollment phase. Therefore, an in-air signature robustness verification system using Linear Discriminant Analysis is proposed to infer automatically whether the gesture is secure or not. Different configurations have been tested, obtaining a lowest EER of 4.01% when 45.02% of gestures were discarded, and an optimal compromise of an EER of 4.82% when 19.19% of gestures were automatically rejected.
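For readers unfamiliar with the EER reported above, the sketch below shows one common way to estimate it from two lists of match scores: sweep a decision threshold and pick the point where the false acceptance rate and the false rejection rate are closest. The function name, the convention that higher scores mean "more likely genuine", and the sample scores are assumptions for illustration; the article's own scoring and evaluation procedure is not described in the abstract.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Return (estimated EER, threshold) from genuine and impostor scores.

    A sample is accepted when its score is >= the threshold.
    """
    best_gap, best_eer, best_thr = None, 0.0, 0.0
    for thr in sorted(set(genuine) | set(impostor)):
        frr = sum(g < thr for g in genuine) / len(genuine)     # genuine rejected
        far = sum(i >= thr for i in impostor) / len(impostor)  # impostor accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr
```

With the toy scores `genuine = [0.9, 0.8, 0.7, 0.4]` and `impostor = [0.1, 0.2, 0.3, 0.6]`, both error rates equal 0.25 at a threshold of 0.6, so the estimated EER is 25%.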
Abstract:
The problem of recurring concepts in data stream classification is a special case of concept drift where concepts may reappear. Although several existing methods are able to learn in the presence of concept drift, few consider contextual information when tracking recurring concepts. Nevertheless, in many real-world scenarios context information is available and can be exploited to improve existing approaches in the detection or even anticipation of recurring concepts. In this work, we propose the extension of existing approaches to deal with the problem of recurring concepts by reusing previously learned decision models in situations where concepts reappear. The different underlying concepts are identified using an existing drift detection method, based on the error-rate of the learning process. A method to associate context information and learned decision models is proposed to improve the adaptation to recurring concepts. The method also addresses the challenge of retrieving the most appropriate concept for a particular context. Finally, to deal with situations of memory scarcity, an intelligent strategy to discard models is proposed. The experiments conducted so far, using synthetic and real datasets, show promising results and make it possible to analyze the trade-off between the accuracy gains and the learned models storage cost.
Abstract:
Information reconciliation is a crucial procedure in the classical post-processing of quantum key distribution (QKD). Poor reconciliation efficiency, revealing more information than strictly needed, may compromise the maximum attainable distance, while poor performance of the algorithm limits the practical throughput in a QKD device. Historically, reconciliation has been mainly done using close to minimal information disclosure but heavily interactive procedures, like Cascade, or using less efficient but also less interactive (just one message is exchanged) procedures, like the ones based on low-density parity-check (LDPC) codes. The price to pay in the LDPC case is that good efficiency is only attained for very long codes and in a very narrow range centered around the quantum bit error rate (QBER) that the code was designed to reconcile, thus requiring several codes if a broad range of QBER needs to be catered for. Real world implementations of these methods are thus very demanding, either on computational or communication resources or both, to the extent that the last generation of GHz clocked QKD systems are finding a bottleneck in the classical part. In order to produce compact, high performance and reliable QKD systems it would be highly desirable to remove these problems. Here we analyse the use of short-length LDPC codes in the information reconciliation context using a low interactivity, blind protocol that avoids an a priori error rate estimation. We demonstrate that LDPC codes of length 2×10^3 bits are suitable for blind reconciliation. Such codes are of high interest in practice, since they can be used for hardware implementations with very high throughput.
Abstract:
Although most of the research on Cognitive Radio is focused on communication bands above the HF upper limit (30 MHz), Cognitive Radio principles can also be applied to HF communications to make more efficient use of the extremely scarce spectrum. In this work we consider legacy users as primary users, since these users transmit without resorting to any smart procedure, and our stations using the HFDVL (HF Data+Voice Link) architecture as secondary users. Our goal is to promote efficient use of the HF band by detecting the presence of uncoordinated primary users and avoiding collisions with them while transmitting in different HF channels using our broadband HF transceiver. A model of the primary user activity dynamics in the HF band is developed in this work to make short-term predictions of the sojourn time of a primary user in the band and avoid collisions. It is based on Hidden Markov Models (HMMs), which are a powerful tool for modelling stochastic processes, and is trained with real measurements of the 14 MHz band. The proposed HMM-based model achieves an average prediction error rate of 10.3% with one minute of channel knowledge, and this can be reduced when the knowledge is extended: with the previous 8 min of knowledge, an average prediction error rate of 5.8% is achieved. These results suggest that the resulting activity model for the HF band could actually be used to predict primary user activity and be included in a future HF cognitive radio based station.
Abstract:
The aim of this study was to examine the effect of positioning on the correctness of decision making of top-class referees and assistant referees during international games. Match analyses were carried out during the Fédération Internationale de Football Association (FIFA) Confederations Cup 2009 and 380 foul play incidents and 165 offside situations were examined. The error percentage for the referees when indicating the incidents averaged 14%. The lowest error percentage occurred in the central area of the field, where the collaboration of the assistant referee is limited, and was achieved when indicating the incidents from a distance of 11–15 m, whereas this percentage peaked (23%) in the last 15-min match period. The error rate for the assistant referees was 13%. Distance of the assistant referee to the offside line did not have an impact on the quality of the offside decision. The risk of making incorrect decisions was reduced when the assistant referees viewed the offside situations from an angle between 46 and 60°. Incorrect offside decisions occurred twice as often in the second as in the first half of the games. Perceptual-cognitive training sessions specific to the requirements of the game should be implemented in the weekly schedule of football officials to reduce the overall error rate.
Abstract:
Understanding embryogenesis in living systems requires reliable quantitative analysis of cell migration throughout all the stages of development. This is a major challenge for "in-toto" reconstruction based on different modalities of "in-vivo" imaging techniques (spatio-temporal resolution, image artifacts and noise). Several methods for cell tracking are available, but expensive manual interaction (time and human resources) is always required to enforce coherence. Because of this limitation it is necessary to restrict the experiments or assume an uncontrolled error rate. Is it possible to obtain automated, reliable measurements of migration? Can we provide a seed for biologists to complete cell lineages efficiently? We propose a filtering technique that treats trajectories as spatio-temporal connected structures and uses multi-dimensional morphological operators to prune out those that might introduce noise and false positives.
Abstract:
This project studies the effect of interference from public networks on the performance of GSM-R communications, which occupy the adjacent frequency band. On the one hand, we define the characteristics of public networks and how the power levels and bandwidths of broadband networks affect GSM-R, especially LTE, whose adaptive bandwidth can reach 20 MHz. On the other hand, we define the characteristics and requirements of GSM-R, a private network currently used for railway communications. To determine the origin and causes of this interference, we explain how the unwanted emissions of public networks arise from the intermodulation produced by the nonlinear characteristics of amplifiers. Unwanted emissions are divided into spurious emissions and out-of-band (OOB) emissions. To determine the level of the OOB emissions we define the Adjacent Channel Leakage Ratio (ACLR), which gives the difference between the peak of the wanted signal and the level of the interfering signal in the passband. We analyze how these unwanted emissions affect GSM-R communications when the interference comes from narrowband signals such as GSM and from broadband signals such as UMTS and LTE, and we study how GSM-R performance varies against LTE signals of different bandwidths. To reduce the impact of interference on GSM-R receivers, we analyze the effect of the receivers' input filters and how the bit error rate (BER) and the ACLR vary. To evaluate the performance of the GSM-R receiver under different types of interference, we simulate two scenarios in which the GSM-R network is affected by interference from a public network base station (BS). In the first scenario the distance between the BS and the GSM-R mobile station (MS) is 4.6 km; the second simulates a typical situation in which a train is a short distance (25 m) from the public network BS. Finally, we present the results as BER and ACLR graphs, and as tables showing the different interference levels and the difference between the power at which an optimal BER of 10^-3 is obtained without interference and the power at which the same value is obtained with interference.
Abstract:
The rapid advance of computing and telecommunications in recent decades has invariably affected the production and provision of services, education, industry, medicine, communications and even interpersonal relationships. Despite these advances, and despite the growing contribution of software to today's world, software development continually runs into the same kinds of problems: systematic delivery delays, budget overruns, a high error rate and usefulness below expectations. To a large extent, these problems are attributable to defects in the processes used to gather, document, agree on and modify system requirements. Requirements are the foundation on which a software product is built, yet the inability to manage their changes is one of the main reasons why a software product is delivered late, exceeds its cost and fails to meet the quality the customer expects. This research has identified the need for methodologies that help deploy a Requirements Management process in small groups and work settings, or in small and medium-sized enterprises. For the purposes of this thesis, such organizations are called Small-Settings. The goal of this doctoral thesis is to develop a metamodel that allows, on the one hand, the Requirements Management process to be implemented and deployed naturally and at low cost and, on the other, mechanisms for its continuous improvement to be developed. The metamodel is supported by tools that maintain a Process Asset Library (PAL) for Requirements Management and provide templates for implementing the process from previously defined assets. The metamodel includes practices and activities that guide, step by step, the implementation of the Requirements Management process in a Small-Setting, using a process model as a reference and the process asset library as the main support tool. Keeping process assets well organized, indexed and readily available is a main factor in the success of the organization's process improvement effort and facilitates the introduction of best practices within the organization. The PAL becomes a repository of information used to keep and make available all process assets that are useful to those who are defining, implementing and managing processes in the organization.