30 results for low rate speech coding
at Universidad Politécnica de Madrid
Abstract:
We present a novel framework for analyzing the encoding latency of arbitrary multiview video coding prediction structures. The framework avoids the need to consider a specific encoder architecture by assuming unlimited processing capacity in the multiview encoder. Under this assumption, only the influence of the prediction structure and the processing times has to be considered, and the encoding latency is solved systematically by means of a graph model. The results obtained with this model are valid for a multiview encoder with sufficient processing capacity and serve as a lower bound otherwise. Furthermore, with the objective of designing low-latency encoders with a low penalty on rate-distortion performance, the graph model allows us to identify the prediction relationships that add the highest encoding latency to the encoder. Experimental results for JMVM prediction structures illustrate how low-latency prediction structures with a low rate-distortion penalty can be derived systematically using the new model.
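As a rough illustration of the graph idea in the abstract above, the sketch below treats frames as nodes and prediction references as edges, and assumes unlimited processing capacity so that a frame starts encoding as soon as it is captured and all its references are encoded. The function name, the frame labels, the timing values and the exact latency definition (completion time minus capture time) are assumptions made for this example, not the authors' model.

```python
from functools import lru_cache


def encoding_latencies(capture_time, proc_time, refs):
    """Return {frame: latency} under unlimited parallel processing capacity.

    capture_time: {frame: time at which the frame reaches the encoder}
    proc_time:    {frame: time needed to encode the frame}
    refs:         {frame: frames it is predicted from}; must form a DAG
    """

    @lru_cache(maxsize=None)
    def finish(frame):
        # Encoding starts once the frame is captured and every reference
        # frame has finished encoding; it then takes proc_time[frame].
        ready = [capture_time[frame]]
        ready += [finish(r) for r in refs.get(frame, ())]
        return max(ready) + proc_time[frame]

    return {f: finish(f) - capture_time[f] for f in capture_time}


# Hypothetical three-frame structure: B1 is predicted from I0 and from P2,
# which is captured later (a backward reference).
capture = {"I0": 0, "B1": 1, "P2": 2}
proc = {"I0": 0.5, "B1": 0.5, "P2": 0.5}
with_backward = encoding_latencies(capture, proc, {"B1": ("I0", "P2"), "P2": ("I0",)})
forward_only = encoding_latencies(capture, proc, {"B1": ("I0",), "P2": ("I0",)})
print(with_backward["B1"], forward_only["B1"])
```

In this toy case the backward reference from B1 to P2 is the relationship that adds latency: removing it lowers the latency of B1 from 2.0 to 0.5 time units, at the cost of whatever rate-distortion penalty the simpler structure incurs.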
Abstract:
This article presents an alternative approach to the decision-making process in transport strategy design. The study explores the possibility of integrating forecasting, assessment and optimization procedures in support of a decision-making process designed to reach the best achievable scenario through mobility policies. Long-term evaluation, as required by a dynamic system such as a city, is provided by a strategic Land-Use and Transport Interaction (LUTI) model. The social welfare achieved by implementing mobility LUTI model policies is measured through a cost-benefit analysis and maximized through an optimization process throughout the evaluation period. The method is tested by optimizing a pricing policy scheme in Madrid on a cordon toll in a context requiring system efficiency, social equity and environmental quality. The optimized scheme yields an appreciable increase in social surplus through a relatively low rate compared to other similar pricing toll schemes. The results highlight the different considerations regarding mobility impacts on the case study area, as well as the major contributors to social welfare surplus. This leads the authors to reconsider the cost-analysis approach, as defined in the study, as the best option for formulating sustainability measures.
Abstract:
Remote reprogramming is a major concern in WSN platforms because of the limitations and constraints that low-power wireless nodes pose, especially when energy efficiency during reprogramming is critical for extending the battery life of the devices. Moreover, WSNs rely on low-rate protocols in which the more data is sent, the higher the probability of losing packets during transmission. To overcome these limitations, this work designs and implements a novel on-the-fly reprogramming technique for modifying and updating the application running on the wireless sensor nodes. It is based on a partial reprogramming mechanism that significantly reduces the size of the files to be downloaded to the nodes, thereby reducing their power and time consumption. This mechanism also supports multi-experiment use, because it makes it possible to download, manage, test and debug multiple applications on the wireless nodes, based on a memory map segmentation of the core. Because reprogramming happens on the fly, no additional resources are needed to store and download the configuration file.
Abstract:
This research stems from the professional experience of the author, a specialist Physical Education teacher at C.E.I.P. “Alhambra” in Madrid, who gradually came to see that table tennis can be a very interesting sport to develop in Physical Education sessions and to promote during recess. The author believes that this sport develops a range of motor, affective, cognitive and social objectives that can contribute to the acquisition of basic competences and to the all-round development of pupils. He then received training in table tennis and sought the funding needed to equip the school with the necessary material. As a result, the Fuencarral-El Pardo Municipal District Board installed three outdoor tables in the schoolyard and, with the school's resources and the help of the parents' association (AMPA), five folding indoor tables and all the necessary material (nets, rackets, balls, etc.) were obtained. After introducing the sport from 3rd to 6th grade of Primary Education, he organized a school championship with a participation rate of around 90% of pupils. These results raised questions for the author that are the motivation and starting point for this research, which analyzes whether table tennis is well suited to the Primary Education stage.

Introduction

Current education legislation, Organic Law 2/2006 of 3 May on Education (LOE), as amended by Organic Law 8/2013 of 9 December for the improvement of educational quality (LOMCE), attaches great importance to sport in general. "Sport is a healthy, fun and formative activity that can have profound benefits not only for the child's health and well-being but also for their integral physical, psychological and psychosocial personal development, as well as for their sporting development" (Pradas, 2009, p.
151), so this is an ideal moment to analyze which sports are practiced in schools and why some are practiced more than others. "Table tennis, besides being a sport for all, is an attractive game that is great fun to play at any age, for children and adults alike, mainly because its rules are simple and it poses no danger to the physical integrity of its players" (Pradas, 2009, p. 83). It is a sport that "is open to all, regardless of age or sex, both as a high-level sport and as a family or social activity" (Gatien, 1993, p. 16). Nevertheless, "works on table tennis are scarce. Few books, whether popular or reflective, adorn the shelves of bookshops and libraries" (Erb, 1999, p. 14), and he adds that "the school environment therefore suffers from a lack of explanatory and pedagogical works on this subject" (Erb, 1999, p. 14). In particular, the study pursues five objectives divided into three categories (the school, the teaching staff and the sport).
• At school level:
- To determine the percentage of schools that have suitable spaces and materials for table tennis, and to identify which of the Territorial Area Directorates (DAT) has the best-equipped schools, in both facilities and materials, for developing programs to promote table tennis.
- To find out the possible reasons why table tennis is not practiced as much as other sports, analyzing the obstacles that limit its establishment as a regular sport in Primary Education schools. To analyze teachers' opinions on the materials and facilities needed for table tennis.
• At teacher level:
- To analyze the level of knowledge of table tennis among the professionals who teach Physical Education, as well as what they need in order to include table tennis teaching units in their course plans.
- To identify the profile of the ideal teacher who recommends the use of table tennis and to gauge teachers' interest in receiving specific table tennis training.
• At sport level:
- To analyze professionals' opinions on the suitability of table tennis for Primary Education in terms of the objectives it pursues, the competences it develops, the contents, assessment criteria and learning standards that can be worked on, and the injuries that occur.

Methodology

The research used an inductive methodology, since it arose from the author's professional experience; it was also cross-sectional, analyzing the situation at a specific point in time, and quantitative. The study population was all the state schools in the Community of Madrid, with Physical Education teachers responsible for providing the requested data. The data were collected with a self-administered questionnaire of closed multiple-choice questions, previously validated by a panel of 5 experts. The indirect variables were the teacher's gender, the teacher's age, professional experience and type of post. Data collection took 3 months, from May 2015 to July 2015, in two phases: one online, through the institutional e-mail of the state schools of the Community of Madrid, and one on site with pencil-and-paper questionnaires.
As for the data obtained, from a population of 798 schools a sample of 276 was achieved, a response rate of 34.59%. Assuming the most unfavorable case (p = q) and a 95% confidence level, the maximum error for the 276 completed questionnaires was ±4.78%.

Results

The results were organized along three dimensions (school level, teacher level and sport level) and sought to establish whether the five stated objectives had been met. Analysis of the results showed that state schools in the Community of Madrid had sufficient facilities for table tennis but lacked specific materials and teacher training, as well as teaching resources and a program to promote table tennis. Teachers showed a clear interest in receiving specific table tennis training, since most recommended using table tennis in the Physical Education subject in Primary Education. Finally, the results showed the number of motor, affective, cognitive and social objectives that table tennis develops, as well as its contribution to the acquisition of basic competences and to objective “k” of Primary Education, which reads “To value hygiene and health, to know and respect the human body, and to use Physical Education and sport as means to foster personal and social development”; they also showed the low injury rate it causes.
Discussion and conclusions

Table tennis is an ideal sport to practice and teach in Physical Education at the Primary Education stage, because of the large amount of content that can be worked on through it and the many individual and social values that its practice can foster. The fact that table tennis has not so far been a regularly practiced sport in the state schools of the Community of Madrid, even though it covers many contents specific to Physical Education, may be due to factors external to the sport itself, which could be resolved with adequate investment in specific materials, teacher training and teaching resources. If schools were provided with the necessary materials and teaching resources and teachers were given training, teachers would introduce table tennis teaching units into their annual course plans. The Spanish and Madrid table tennis federations should develop a promotion program that supplies schools with materials and resources, as other federations such as volleyball, badminton and basketball have done.

ABSTRACT

This research arises from the professional experience of the author, a specialist Physical Education teacher at CEIP "Alhambra" in Madrid, who gradually came to appreciate that table tennis can be a very interesting sport to develop in Physical Education sessions and to promote during playtime. The author believes that this sport develops a range of motor, affective, cognitive and social objectives that can contribute to the acquisition of basic skills and the integral development of students. He then received training in table tennis and sought funding to equip the school with the necessary material.
The Fuencarral-El Pardo Municipal District installed three outdoor tables in the schoolyard and, with the resources of the school and the support of the parents' association (AMPA), five indoor folding tables were obtained, along with all the necessary material (nets, rackets, balls, etc.). After introducing the sport from 3rd to 6th grade of primary education, he organized a school championship with a participation rate of around 90% of students. These results raised questions for the author that are the motivation and starting point for this research, which analyzes whether table tennis is well suited to the primary education stage.

Introduction

The current legislation on education, Organic Law 2/2006 of 3 May on Education (LOE), as amended by Organic Law 8/2013 of 9 December to improve educational quality (LOMCE), attaches great importance to sport in general. "Sport is a healthy, fun and educational activity that can have great benefits not only for the child's health and well-being but also for their comprehensive physical, psychological and psychosocial personal development, as well as for their sports development" (Pradas, 2009, p. 151); it is therefore an ideal moment to analyze which sports are practiced in schools and why some are practiced more than others. "Table tennis, as well as being a sport for everyone, is an attractive game that is fun to play at any age, for both children and adults, mainly because it has simple rules and poses no danger to the physical integrity of its players" (Pradas, 2009, p. 83). It is a sport that is "open to all, regardless of age or sex, both as a high-level sport and as a family or social activity" (Gatien, 1993, p. 16). However, "there are few works on table tennis. Few books, whether reflective or popular, adorn the shelves of bookstores and libraries."
(Erb, 1999, p. 14), and he adds: "So the school environment suffers from a lack of explanatory and educational works on this subject" (Erb, 1999, p. 14). In particular, the study aims to achieve the following objectives within the Community of Madrid:
• To determine the percentage of schools that have spaces and materials suitable for table tennis, and to identify which of the Territorial Area Directorates (DAT) has the best-equipped schools, in both facilities and materials, for developing programs to promote table tennis.
• To find out the possible reasons why table tennis is not practiced as much as other sports, analyzing the impediments that limit its implementation as a regular sport in primary schools. To analyze teachers' opinions on the materials and facilities needed for table tennis.
• To analyze the level of knowledge of table tennis among the professionals who teach Physical Education, and what they need in order to include table tennis teaching units in their teaching programs.
• To identify the profile of the ideal teacher who recommends the use of table tennis and to gauge teachers' interest in receiving specific table tennis training.
• To analyze professionals' opinions on the suitability of table tennis for Primary Education, taking into account the objectives pursued, the skills developed, the content, evaluation criteria and learning standards that can be worked on, and the injuries involved.

Methodology

The investigation used an inductive methodology, arising from the professional experience of the author; it was also cross-sectional, analyzing the situation at a particular point in time, and quantitative. The population under study was all the state schools in the Madrid region, with Physical Education teachers responsible for providing the requested data.
The data were collected with a self-administered questionnaire of multiple-choice questions, chosen because it facilitates their analysis. From a population of 798 schools, a sample of 276 was achieved, a response rate of 34.59%. Assuming the worst-case scenario (p = q) and a 95% confidence level, the maximum error for the 276 completed questionnaires was ±4.78%.

Results

The results were organized along three dimensions (school level, teacher level and sport level) and sought to establish whether the five objectives had been achieved. Analysis of the results showed that schools had sufficient facilities for table tennis but lacked specific materials and teacher training, as well as teaching resources and a program to promote table tennis. Teachers showed a clear interest in receiving specific table tennis training, since most recommended the use of table tennis in Physical Education in primary education. Finally, the results showed the number of motor, affective, cognitive and social objectives developed by table tennis and its contribution to the acquisition of basic skills and to objective "k" of primary education, in addition to the low injury rate it causes.

Discussion and conclusions

Table tennis is an ideal sport to practice and teach in Physical Education in Primary Education, because of the large amount of content that can be worked on through it and the large number of individual and social values that its practice can foster.
The fact that table tennis has not so far been practiced regularly in schools, even though it covers many contents specific to Physical Education, may be due to factors external to the sport itself, which could be resolved with adequate investment in specific materials, teacher training and educational resources. If schools were provided with the necessary teaching materials and resources and teachers were given training, teachers would introduce table tennis teaching units into their annual programs. The Spanish and Madrid table tennis federations should develop a promotion program that supplies schools with materials and resources, as other federations such as badminton and basketball have done.
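The response rate and maximum error quoted in the methodology above can be checked with a short calculation. The abstract does not say which formula was used, so the sketch below assumes the usual margin of error for a proportion with a finite population correction and z = 1.96 for 95% confidence; the function name and parameter names are made up for this example.

```python
import math


def max_sampling_error(population, sample, z=1.96, p=0.5):
    """Margin of error for a proportion, with finite population correction.

    p = 0.5 is the worst case (p = q); z = 1.96 is assumed for 95% confidence.
    """
    q = 1.0 - p
    standard_error = math.sqrt(p * q / sample)
    # Correction for sampling a large share of a finite population.
    fpc = math.sqrt((population - sample) / (population - 1))
    return z * standard_error * fpc


response_rate = 276 / 798
error = max_sampling_error(population=798, sample=276)
print(f"response rate: {response_rate:.2%}")
print(f"maximum error: ±{error:.2%}")
```

Under these assumptions the response rate matches the reported 34.59% and the error comes out at roughly ±4.77%, close to the reported ±4.78%; the small gap may be down to rounding or to a slightly different variant of the formula.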
Abstract:
Speech is the main communication tool available to human beings; it not only lets them express their thoughts and feelings but also distinguishes them as individuals. Analysis of the speech signal is fundamental to many applications, such as speech synthesis and recognition, coding, pathology detection, and speaker identification and recognition. Commercial and freely distributed tools for this task are available on the market. The aim of this Final Degree Project is to bring together several speech signal analysis algorithms in a single tool operated through a graphical environment. The algorithms are used by the Grupo de investigación en Aplicaciones MultiMedia y Acústica of the Universidad Politécnica de Madrid in its research work and to offer training workshops to undergraduate students of the Escuela Técnica Superior de Ingeniería y Sistemas de Telecomunicación. Applying the algorithms has proved difficult, because they were developed over several years by different people and in different programming environments. The existing programs have been adapted to produce a single MATLAB tool that provides:
• Voice detection
• Voiced/unvoiced detection
• Extraction and manual review of the fundamental frequency of voiced sounds
• Extraction and manual review of the formants of voiced sounds
In all cases the user can adjust the analysis parameters, and the functionality of the existing algorithms has been maintained and, in some cases, extended. The analysis results can be handled directly in the application or saved to a file. Finally, the user manual for the application has been written, and a standalone application has been generated that can be installed and run even without MATLAB or the appropriate MATLAB version.

ABSTRACT
Speech is the main communication tool of human beings; as well as allowing them to express their thoughts and feelings, it distinguishes them as individuals. The analysis of the speech signal is essential for many applications, such as speech synthesis and recognition, coding, detection of pathologies, and speaker identification and recognition. Commercial and open source tools for this task are available on the market. The aim of this Final Degree Project is to collect several speech signal analysis algorithms in a single tool managed through a graphical environment. These algorithms are used by the research group Aplicaciones MultiMedia y Acústica at the Universidad Politécnica de Madrid to carry out its research work and to offer training workshops for students at the Escuela Técnica Superior de Ingeniería y Sistemas de Telecomunicación. Applying the algorithms has proved difficult, because they were developed over several years, by different people and in different programming environments. Existing programs have been adapted to generate a single tool in MATLAB that provides:
• Voice detection
• Voiced/unvoiced detection
• Extraction and manual review of the fundamental frequency of voiced sounds
• Extraction and manual review of the formants of voiced sounds
In all cases the user can adjust the analysis settings, and the functionality of the existing algorithms has been maintained and in some cases expanded. The analysis results can be managed directly in the application or saved to a file. Finally, the application user's manual has been written, and a standalone application has been generated that can be installed and run even if the user does not have MATLAB or the appropriate version of it.
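The abstract does not describe which algorithms the MATLAB tool uses. Purely to illustrate what "extraction of the fundamental frequency of voiced sounds" can involve, the sketch below applies a textbook autocorrelation method to a single frame, written in Python rather than MATLAB. The function name, the 60 to 400 Hz search range and the synthetic test signal are assumptions made for this example.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame.

    Picks the lag with the largest autocorrelation inside the range of
    plausible pitch periods; returns None if no positive peak is found.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None


# Synthetic "voiced" frame: 40 ms of a 200 Hz sine sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(320)]
f0 = estimate_f0(frame, sample_rate)
print(f0)
```

Real speech would need windowing, a voicing decision before pitch estimation, and protection against octave errors, which is one reason a manual review step like the one offered by the tool is useful.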
Abstract:
Puncturing is a well-known coding technique widely used for constructing rate-compatible codes. In this paper, we consider the problem of puncturing low-density parity-check codes and propose a new algorithm for intentional puncturing. The algorithm is based on the puncturing of untainted symbols, i.e. nodes with no punctured symbols within their neighboring set. It is shown that the proposed algorithm performs better than previous proposals for a range of coding rates and small proportions of punctured symbols.
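The notion of an untainted symbol can be made concrete with a small greedy sketch. It follows only the definition given in the abstract (a symbol is untainted while no symbol in its neighboring set is punctured) and takes the neighboring set to be the symbols that share a parity check with it. The function name, the lowest-index-first selection order and the toy parity checks are assumptions for this example; the paper's actual selection rule is not given here.

```python
def puncture_untainted(checks, n_symbols, n_puncture):
    """Greedily choose symbols to puncture, picking only untainted ones.

    checks: list of lists; checks[j] holds the symbol indices in check j.
    Returns the punctured symbol indices (possibly fewer than n_puncture
    if the untainted symbols run out).
    """
    # Neighboring set of each symbol: the other symbols in any of its checks.
    neighbors = [set() for _ in range(n_symbols)]
    for check in checks:
        for s in check:
            neighbors[s].update(x for x in check if x != s)

    punctured, excluded = [], set()
    for s in range(n_symbols):  # assumed tie-break: lowest index first
        if len(punctured) == n_puncture:
            break
        if s not in excluded:
            punctured.append(s)
            excluded.add(s)               # already punctured
            excluded.update(neighbors[s])  # these now see a punctured neighbor
    return punctured


# Toy code with 6 symbols and 3 parity checks (hypothetical).
toy_checks = [[0, 1, 2], [2, 3, 4], [4, 5, 0]]
chosen = puncture_untainted(toy_checks, n_symbols=6, n_puncture=2)
print(chosen)
```

With this selection every parity check contains at most one punctured symbol, which is the property the untainted criterion is meant to preserve. Puncturing p of n symbols raises the rate of a code with k information symbols from k/n to k/(n - p).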
Abstract:
This Thesis analyzes the possibilities that speech technologies currently offer for detecting clinical pathologies associated with the upper airway. The study of speech, which traditionally covers both production and the transformation of the message and the signals involved, from the sender to the receiver, offers an alternative way of studying these pathologies. The fact that the emitted signal contains not only the message but also information about the speaker has motivated the development of systems for identifying and verifying speaker identity. This work has recently received new impetus, focusing both on the characterization of traits common to several speakers and on the differences between recordings of the same speaker. The former are especially relevant to this Thesis, since such traits could reveal the presence of characteristics related to a condition shared by several speakers, independent of their identity. Such is the case addressed in this Thesis, where the traits identified would be related to a particular pathology directly linked to the physical speech production system. Sleep Apnea-Hypopnea Syndrome (SAHS) is a paradigmatic case. It is a pathology with a high prevalence worldwide, which increases with age. Patients experience episodes of involuntary cessation of breathing during sleep that last several seconds and recur throughout the night, preventing proper rest. In obstructive apnea, these episodes are due to the inability to keep a path open through the airway, so that airflow is interrupted.
At present, these patients are diagnosed through a polysomnographic study, which focuses on analyzing apnea episodes during sleep and requires the patient to spend a night in hospital. The complexity and high cost of these procedures, together with growing waiting lists, have highlighted the need for rapid detection techniques that, although they might not achieve such high rates, would make it possible to reorganize waiting lists according to the severity of the pathology in each patient. Among others, diagnostic imaging systems and the anthropometric characterization of patients have revealed anatomical patterns that would directly influence speech. Work on how SAHS affects speech has been scarce, and some of it even contradictory. Nevertheless, the existence of specific patterns related to articulation, phonation and resonance has been known since the late 1980s. Their description was hard to exploit in an automatic recognition system, but it pointed to a link between voice and SAHS. In recent years, automatic processing techniques have enabled systems that can already identify significant differences in the speech of SAHS patients and distinguish them from healthy speakers. By contrast, little is known about the connection between these new results, those obtained in the past, and the pathogenesis of SAHS.
This Thesis continues the work in this field, specifically considering: the study of how SAHS affects patients' speech, the improvement of automatic classification rates, and the combination of the information obtained with the predictors used by clinical specialists in their preliminary assessments. The first two tasks pose symbiotic but different problems. While studying the connection between SAHS and speech requires compact models that can be interpreted easily, recognition systems rely on a large number of dimensions to characterize and then identify patterns. The first task should therefore allow progress in the second, as should the incorporation of the predictors used by clinical specialists. The Thesis studies both continuous and sustained speech in order to exploit the synergies and differences between them. The analysis of continuous speech started from a previously evaluated scheme, on which the speech representation was evaluated and optimized and the specific patterns associated with SAHS were characterized. This revealed the connection between SAHS and the fundamental elements of the speech signal: the formants. The results show that the success of these systems is due mainly to the ability of these representations to describe those components while leaving out noisy or poorly discriminative dimensions. The resulting scheme achieves an error rate below 18%, using classifiers notably less complex than those described in the state of the art and a single short voice recording.
Regarding the connection between SAHS and the observed patterns, it was necessary to consider inter- and intra-group differences, focusing on the speaker's characteristic articulation and replacing complex classification models with the study of spectral averages. The result points clearly to certain regions of the frequency axis, suggesting a systematic narrowing of the tract cross-section in the oropharyngeal region, as already anticipated by the pathogenesis of the syndrome. For sustained speech, the studies carried out on continuous speech were repeated on recordings of the sustained vowel /a/. The results are qualitatively similar to the earlier ones, although the classification rates are lower in this case. To understand this result, the study of spectral averages and of inter- and intra-group variability was repeated. Both studies showed important differences from the earlier ones that could explain these results. However, sustained speech offers other opportunities, since it provides a controlled setting for studying phonation, which had also been identified as a source of information for detecting SAHS. This study showed that, in the available data set, there are no variations that could easily be associated with phonation. Only the dimensions describing the distribution of energy along the frequency axis showed significant differences, pointing once again toward spectral resonances. Having analyzed these results, the Thesis addresses the fusion of both sources of information in a single classification system. This makes it possible to improve classification rates, under the hypothesis that the information present in continuous speech and in sustained speech is fundamentally different.
This task was carried out through a simple fusion scheme that achieved 88.6% correct classification (11.4% error rate), which represents a significant improvement over the state of the art. Finally, combining this classifier with the predictors used by clinical specialists yielded a rate of 91.3% (8.7% error rate), which lies within the range offered by more costly and intrusive schemes that, unlike the one proposed, cannot be used in the preliminary evaluation of patients. All in all, the Thesis offers a clear view of the relationship between OSA and speech, evidencing the degree of maturity reached by speech technology in the characterization and detection of OSA, showing that its use for patient evaluation would already be possible, and leaving the door open to future research that continues the work begun here. ABSTRACT This Thesis explores the potential of speech technologies for the detection of clinical disorders connected to the upper airway. The study of speech traditionally covers both the production process and the post-processing of the signals involved, from the speaker to the listener, offering an alternative path to study these pathologies. The fact that utterances embed not just the encoded message but also information about the speaker has motivated the development of automatic systems oriented to the identification and verification of the speaker’s identity. These have recently been boosted and reoriented either towards the characterization of traits that are common to several speakers, or towards the differences between records of the same speaker collected under different conditions. The former are particularly relevant to this Thesis, as these patterns could reveal the presence of features that are related to a common condition shared among different speakers, regardless of their identity.
Such is the case faced in this Thesis, where the traits identified would relate to a particular pathology, directly connected to the speech production system. The Obstructive Sleep Apnea syndrome (OSA) is a paradigmatic case for analysis. It is a disorder with high prevalence among adults, and its prevalence increases with age. Patients suffering from this disorder experience episodes of involuntary cessation of breath during sleep that may last a few seconds and recur throughout the night, preventing proper rest. In the case of obstructive apnea, these episodes are related to the collapse of the pharynx, which interrupts the air flow. Currently, OSA diagnosis is done through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep and requires the patient to stay at the hospital for the whole night. The complexity and high cost of the procedures involved, combined with the waiting lists, have evidenced the need for screening techniques, which perhaps would not achieve outstanding performance rates but would allow clinicians to reorganize these lists, ranking patients according to the severity of their condition. Among others, imaging diagnosis and anthropometric characterization of patients have evidenced the existence of anatomical patterns related to OSA that have a direct influence on speech. Contributions devoted to the study of how this disorder affects speech are scarce and somewhat contradictory. However, the existence of specific patterns related to articulation, phonation and resonance has been known since the late 1980s. At that time these descriptions were virtually useless for the development of an automatic system, but they pointed out the existence of a link between speech and OSA. In recent years automatic processing techniques have evolved and are now able to identify significant differences in the speech of OSA patients when compared to records from healthy subjects.
Nevertheless, little is known about the connection between these new results, those published in the past, and the pathogenesis of the OSA syndrome. This Thesis aims to progress beyond previous research in this area by addressing: the study of how OSA affects patients’ speech, the enhancement of automatic OSA classification based on speech analysis, and its integration with the information embedded in the predictors generally used by clinicians in the preliminary examination of patients. The first two tasks, though they may appear symbiotic at first, are quite different. While studying the connection between speech and OSA requires simple, narrow models that can be easily interpreted, classification requires larger models with a large number of dimensions for the characterization and subsequent identification of the observed patterns. In any case, any progress made on the first task should improve performance on the second, and the incorporation of the predictors used by clinicians should contribute in the same direction. The Thesis considers both continuous and sustained speech analysis, to exploit the synergies and differences between them. For continuous speech analysis, a conventional speech processing scheme, designed and evaluated before this Thesis, was taken as a baseline. On top of this initial system, several alternative representations of the speech information were proposed, optimized and tested to select those most suitable for the characterization of OSA-specific patterns. Evidence was found of a connection between OSA and the fundamental constituents of speech: the formants. Experimental results showed that the success of the proposed solution is well explained by the ability of the speech representations to describe these specific OSA-related components while ignoring the noisy ones as well as those with low discrimination capability.
The resulting scheme obtained an error rate below 18%, using a classification scheme significantly less complex than those described in the literature and operating on a single short speech record. Regarding the connection between OSA and the observed patterns, it was necessary to consider inter- and intra-group differences and to focus on articulation, replacing the complex classification models with long-term average spectra. Results clearly point to certain regions of the frequency axis, suggesting the existence of a systematic narrowing of the vocal tract section at the oropharynx. This was already described in the pathogenesis of this syndrome. Regarding sustained speech, experiments similar to those conducted on continuous speech were reproduced on sustained phonations of the vowel /a/. Results were qualitatively similar to the previous ones, though in this case performance rates were found to be noticeably lower. To derive further knowledge from this result, the experiments on long-term average spectra and on intra- and inter-group variability ratios were also reproduced on sustained speech records. Results of both experiments showed significant differences from those obtained on continuous speech, which could explain the differences observed in performance. However, sustained speech also provided the opportunity to study phonation within a controlled framework. Phonation had also been identified in the literature as a source of information for the detection of OSA. In this study it was found that, for the available dataset, no systematic differences related to phonation could be found between the two groups of speakers. Only those dimensions that describe the energy distribution along the frequency axis provided significant differences, pointing once again towards the resonant components.
Once classification schemes on both continuous and sustained speech were developed, the Thesis addressed their combination into a single classification system. Under the assumption that the information in continuous and sustained speech is fundamentally different, it should be possible to merge the two successfully. This was tested through a simple fusion scheme which obtained 88.6% correct classification (11.4% error rate), a significant improvement over the state of the art. Finally, the combination of this classifier with the variables used by clinicians obtained 91.3% accuracy (8.7% error rate). This is within the range of alternative, but costly and intrusive, schemes which, unlike the one proposed, cannot be used in the preliminary assessment of patients’ condition. In the end, this Thesis has shed new light on the underlying connection between OSA and speech, and evidenced the degree of maturity reached by speech technology in OSA characterization and detection, leaving the door open for future research in the multiple directions that have been pointed out and left as future work.
Resumo:
This thesis presents a novel framework for the analysis and optimization of encoding and decoding delay in multiview video. The aim of this framework is to provide a systematic methodology for analyzing delay in multiview encoders and decoders, together with useful tools for designing encoders/decoders for applications with low-delay requirements. The proposed framework first characterizes the elements that influence delay behavior: i) the multiview prediction structure, ii) the hardware model of the encoder/decoder, and iii) the frame processing times. Second, it provides algorithms for computing the encoding/decoding delay of any arbitrary multiview prediction structure. The core of this framework is a methodology for analyzing multiview encoding/decoding delay that is independent of the hardware architecture of the encoder/decoder, complemented by a set of models that tailor this delay analysis to the characteristics of the encoder/decoder hardware architecture. Among these models, those based on graph theory are especially relevant because they can decouple the influence of the different elements on the delay behavior of the encoder/decoder by abstracting its processing capacity. To show the possible applications of this framework, this thesis presents some examples of its use in design problems that affect multiview encoders and decoders.
This application scenario covers the following cases: strategies for designing prediction structures that take delay requirements into account in addition to rate-distortion performance; choosing the number of processors and analyzing the processing speed requirements of multiview encoders/decoders given a target delay; and the comparative analysis of delay behavior in multiview encoders with different processing capabilities and hardware implementations. ABSTRACT This thesis presents a novel framework for the analysis and optimization of the encoding and decoding delay for multiview video. The objective of this framework is to provide a systematic methodology for the analysis of the delay in multiview encoders and decoders and useful tools for the design of multiview encoders/decoders for applications with low delay requirements. The proposed framework first characterizes the elements that have an influence on the delay performance: i) the multiview prediction structure, ii) the hardware model of the encoder/decoder and iii) frame processing times. Secondly, it provides algorithms for the computation of the encoding/decoding delay of any arbitrary multiview prediction structure. The core of this framework consists of a methodology for the analysis of the multiview encoding/decoding delay that is independent of the hardware architecture of the encoder/decoder, which is completed with a set of models that particularize this delay analysis to the characteristics of the hardware architecture of the encoder/decoder. Among these models, the ones based on graph theory acquire special relevance due to their capacity to decouple the influence of the different elements on the delay performance of the encoder/decoder, by means of an abstraction of its processing capacity.
To reveal possible applications of this framework, this thesis presents some examples of its use in design problems that affect multiview encoders and decoders. This application scenario covers the following cases: strategies for the design of prediction structures that take into consideration delay requirements in addition to the rate-distortion performance; design of the number of processors and analysis of processor speed requirements in multiview encoders/decoders given a target delay; and comparative analysis of the encoding delay performance of multiview encoders with different processing capabilities and hardware implementations.
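As a rough illustration of the graph-based delay analysis described in this abstract, the following minimal sketch computes the encoding latency of a prediction structure when processing capacity is assumed to be unlimited. It is one possible reading of the abstract, not the thesis's algorithm: the function name `encoding_latency`, the frame labels, and the rule that a frame starts encoding as soon as it has been captured and all of its reference frames have been encoded are hypothetical choices made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def encoding_latency(capture_time, proc_time, references):
    """Worst-case encoding latency over all frames (illustrative sketch).

    capture_time: dict frame -> time at which the frame is captured
    proc_time:    dict frame -> time needed to encode the frame
    references:   dict frame -> iterable of frames it is predicted from
    Assumption: unlimited processors, so a frame only waits for its own
    capture and for the completion of its reference frames.
    """
    graph = {f: tuple(references.get(f, ())) for f in capture_time}
    finish = {}
    # static_order() yields every frame after all of its references
    for f in TopologicalSorter(graph).static_order():
        ready = max([capture_time[f]] + [finish[r] for r in graph[f]])
        finish[f] = ready + proc_time[f]
    # Latency of a frame: time from its capture until it is fully encoded
    return max(finish[f] - capture_time[f] for f in finish)


# Hypothetical two-view example: view 1 is predicted from view 0.
capture = {"v0": 0.0, "v1": 0.0}
proc = {"v0": 1.0, "v1": 1.0}
refs = {"v1": ["v0"]}
print(encoding_latency(capture, proc, refs))  # prints 2.0
```

In this toy case the inter-view dependency doubles the latency of the second view, which is the kind of prediction relationship such an analysis is meant to expose.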
Resumo:
The postprocessing or secret-key distillation process in quantum key distribution (QKD) mainly involves two well-known procedures: information reconciliation and privacy amplification. Information or key reconciliation has been customarily studied in terms of efficiency. During reconciliation, some information needs to be disclosed to reconcile discrepancies in the exchanged keys. The leakage of information is lower bounded by a theoretical limit, and is usually parameterized by the reconciliation efficiency (or inefficiency), i.e. the ratio of the information disclosed to the Shannon limit. Most techniques for reconciling errors in QKD try to optimize this parameter. For instance, the well-known Cascade (probably the most widely used procedure for reconciling errors in QKD) was recently shown to have an average efficiency of 1.05 at the cost of high interactivity (number of exchanged messages). Modern coding techniques, such as rate-adaptive low-density parity-check (LDPC) codes, were also shown to achieve similar efficiency values exchanging only one message, or even better values with little interactivity and shorter block-length codes.
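The efficiency parameter mentioned in this abstract can be made concrete with a small sketch. It assumes, as is common for discrete-variable QKD but not stated in the abstract, that the errors are modelled as a binary symmetric channel with crossover probability equal to the quantum bit error rate (QBER), so that the Shannon limit for a key of n bits is n·h(QBER), with h the binary entropy. The function names and the numbers in the example are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary Shannon entropy h(p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def reconciliation_efficiency(leaked_bits: float, key_length: int, qber: float) -> float:
    """Ratio of the information disclosed to the Shannon limit n * h(qber).

    A value of 1.0 would mean reconciliation at the theoretical limit;
    practical protocols give values above 1.0.
    """
    shannon_limit = key_length * binary_entropy(qber)
    if shannon_limit == 0.0:
        raise ValueError("Shannon limit is zero: no errors to reconcile")
    return leaked_bits / shannon_limit


def leakage_for_efficiency(efficiency: float, key_length: int, qber: float) -> float:
    """Bits disclosed by a protocol operating at the given efficiency."""
    return efficiency * key_length * binary_entropy(qber)


# Illustrative numbers only: a 10,000-bit key with 2% QBER,
# reconciled at the efficiency of 1.05 quoted for Cascade.
leak = leakage_for_efficiency(1.05, 10_000, 0.02)
print(round(reconciliation_efficiency(leak, 10_000, 0.02), 2))
```

Under this model, an efficiency of 1.05 means that 5% more bits are disclosed than the minimum required to correct the errors.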
Resumo:
Several issues concerning the current use of speech interfaces are discussed, and the design and development of a speech interface that enables air traffic controllers to command and control their terminals by voice is presented. Special emphasis is placed on the comparison between laboratory experiments and field experiments, in which a set of ergonomics-related effects are detected that cannot be observed in controlled laboratory experiments. The paper presents both objective and subjective performance obtained in a field evaluation of the system with student controllers at an air traffic control (ATC) training facility. The system exhibits high word recognition test rates (0.4% error in Spanish and 1.5% in English) and low command error (6% error in Spanish and 10.6% error in English in the field tests). Subjective impression has also been positive, encouraging future development and integration phases in the Spanish ATC terminals designed by Aeropuertos Españoles y Navegación Aérea (AENA).
Resumo:
The purpose of this paper is to provide information on the behaviour of steel prestressing wires under likely conditions that could be expected during a fire or impact loads. Four loadings were investigated: a) the influence of strain rate (from 10⁻³ to 600 s⁻¹) at room temperature, b) the influence of temperature (from 24 to 600 °C) at low strain rate, c) the influence of the joint effect of strain rate and temperature, and d) damage after three plausible fire scenarios. At room temperature it was found that using “static” values is a safe option. At high temperatures our results are in agreement with design codes. Regarding the joint effect of temperature and strain rate, mechanical properties decrease with increasing temperature, although for a given temperature, yield stress and tensile strength increase with strain rate. The data provided can be used profitably to model the mechanical behaviour of steel wires under different scenarios.
Resumo:
This paper proposes the use of Factored Translation Models (FTMs) for improving a Speech into Sign Language Translation System. These FTMs allow syntactic-semantic information to be incorporated during the translation process. This new information significantly reduces the translation error rate. This paper also analyses different alternatives for dealing with non-relevant words. The speech into sign language translation system has been developed and evaluated in a specific application domain: the renewal of Identity Documents and Driver’s License. The translation system uses a phrase-based translation system (Moses). The evaluation results reveal that the BLEU (BiLingual Evaluation Understudy) has improved from 69.1% to 73.9% and the mSER (multiple references Sign Error Rate) has been reduced from 30.6% to 24.8%.
Resumo:
This paper presents a low-power, high-speed 4-data-path 128-point mixed-radix (radix-2 & radix-2²) FFT processor for MB-OFDM Ultra-WideBand (UWB) systems. The processor employs the single-path delay feedback (SDF) pipelined structure for the proposed algorithm, and it uses substructure-sharing multiplication units and a shift-add structure rather than traditional complex multipliers. Furthermore, the word lengths are properly chosen, so the hardware costs and power consumption of the proposed FFT processor are efficiently reduced. The proposed FFT processor is verified and synthesized using 0.13 µm CMOS technology with a supply voltage of 1.32 V. The implementation results indicate that the proposed 128-point mixed-radix FFT architecture supports a throughput rate of 1 Gsample/s with lower power consumption in comparison to existing 128-point FFT architectures.
Resumo:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to somehow combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based