951 results for Visual Speech Recognition, Multiple Views, Frontal View, Profile View
Abstract:
Although there has been a lot of interest in recognizing and understanding air traffic control (ATC) speech, none of the published works have obtained detailed field data results. We have developed a system able to identify the language spoken and to recognize and understand sentences in both Spanish and English. We also present field results for several in-tower controller positions. To the best of our knowledge, this is the first time that field ATC speech (not simulated) has been captured, processed, and analyzed. The use of stochastic grammars allows for the variations in the standard phraseology that appear in field data. The robust understanding algorithm developed has 95% concept accuracy from ATC text input. It also allows changes in the presentation order of the concepts and the correction of errors created by the speech recognition engine, improving the percentage of fully correctly understood sentences by 17% and 25% absolute for English and Spanish, respectively, relative to the percentage of fully correctly recognized sentences. An analysis of the errors due to the spontaneity of the speech, and a comparison with read speech, are also carried out. A 96% word accuracy for read speech is reduced to 86% word accuracy for field ATC data for Spanish in the "clearances" task, confirming that field data is needed to estimate the performance of a system. A literature review and a critical discussion of the possibilities of speech recognition and understanding technology applied to ATC speech are also given.
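For readers unfamiliar with the metric, word accuracy figures such as the 96% and 86% quoted above are conventionally derived from a word-level edit-distance alignment between a reference transcript and the recognizer output. The sketch below assumes the common definition (N - S - D - I) / N, where N is the number of reference words and S, D, I are substitutions, deletions, and insertions; the abstract does not say which scoring tool the authors used, and the function name and sample phrases are hypothetical.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - (substitutions + deletions + insertions) / N,
    where N is the number of words in the reference transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr

    errors = prev[-1]
    return 1.0 - errors / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substituted word in a four-word reference.
    print(word_accuracy("cleared to land runway", "cleared to hand runway"))
```

Because insertions count as errors, this measure can fall below zero when the hypothesis is much longer than the reference.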
Abstract:
We present a novel approach for the detection of severe obstructive sleep apnea (OSA) based on patients' voices, introducing nonlinear measures to describe sustained speech dynamics. Nonlinear features were combined with state-of-the-art speech recognition systems using statistical modeling techniques (Gaussian mixture models, GMMs) over cepstral parameterization (MFCC) for both continuous and sustained speech. Tests were performed on a database including speech records from both severe OSA and control speakers. A 10% relative reduction in classification error was obtained for sustained speech when combining MFCC-GMM and nonlinear features, and 33% when fusing nonlinear features with both sustained and continuous MFCC-GMM. Accuracy reached 88.5%, allowing the system to be used in OSA early detection. Tests showed that nonlinear features and MFCCs are weakly correlated on sustained speech, but uncorrelated on continuous speech. Results also suggest the existence of nonlinear effects in OSA patients' voices, which should be found in continuous speech.
Abstract:
This paper presents a methodology for adapting an advanced communication system for deaf people to a new domain. The methodology is a user-centered design approach consisting of four main steps: requirement analysis, parallel corpus generation, technology adaptation to the new domain, and, finally, system evaluation. In this paper, the new domain considered is dialogues at a hotel reception desk. With this methodology, it was possible to develop the system in a few months, obtaining very good performance: good speech recognition and translation rates (around 90%) with short processing times.
Diseño de un videojuego orientado a mejorar el proceso de enseñanza-aprendizaje de la lengua inglesa (Design of a video game aimed at improving the teaching and learning of the English language)
Abstract:
Since globalization began to have an effect on today's society, the English language has emerged as the first choice for communication among large companies, especially in the field of business. Command of this language, which has grown in number of speakers over the years, has therefore become necessary, and more and more people want to master it. Learning now starts at a very early age, which facilitates and improves the acquisition of a knowledge base covering all the skills of the English language: reading, writing, speaking, and listening. Someone who masters all of these skills can achieve a high level of English. The aim of this project was to improve the process of teaching and learning English among children under 13 years of age. To do so, a learning method was proposed that motivates users and gives them constant help as they progress in their knowledge of the language. The method chosen was an interactive video game meeting these requirements. A video game for learning English that also includes something as novel as speech recognition, used to improve the user's speaking skills, would help users raise their basic level of English in all four skills and establish a solid base to support more advanced knowledge in the future.
Abstract:
This paper describes the GTH-UPM system for the Albayzin 2014 Search on Speech Evaluation. The evaluation task consists of searching for a list of terms/queries in audio files. The GTH-UPM system we present is based on an LVCSR (Large Vocabulary Continuous Speech Recognition) system. We have used the MAVIR corpus and the Spanish partition of the EPPS (European Parliament Plenary Sessions) database for training both acoustic and language models. The main effort has been focused on lexicon preparation and text selection for the language model construction. The system makes use of different lexicons and language models depending on the task being performed. For the best configuration of the system on the development set, we have obtained a FOM of 75.27 for the keyword spotting task.
Abstract:
Houses of the 20th century built by the sea, a unique setting which gives rise to their expression, follow the topography of the land they occupy in its descent towards the water, and they organize spaces which give views of the maritime horizon. The horizon brings us face to face with the unknown. The sea view provokes a desire to cross it, to travel. The presence of water defines a peaceful, epicurean way of life, with origins in the Roman otium, which the architectural travellers who accompany us through the thesis translate into their most intimate retreats. They experiment with changes in concepts and techniques, which have been easily transferred to the architecture of the seaside house since the beginnings of the Modern Movement. Their living spaces facing the sea allow us to discover common strategies in their most modern responses. The architect with a seaside project arrives at the chosen site, looks towards the horizon, then walks uphill and takes up a strategic point of his choosing; then, at his drawing board, he retraces his steps, sets out the site, and draws on it the elements that will make up the spaces of the house in search of that sea view. The number of potential situations and locations is infinite; certain common spatial consonances, due to the presence of the sea, are defined in the land which is occupied. Comparison of all the houses, emblems of the 20th century, throws up multiple variations of view and open spaces, and ways of creating settings with common criteria so as to command the vision of the sea. Interiors which open up to the panorama, spaces whose windows seek their view in the expanse of the horizon, openings and closures. We recognise determinant factors in the territory to which the villas respond, architectural categories which give a seaside response in the search for the modern: the topography, the view, and the open space. The houses share the idea of dominating the landscape from the highest point, and on some sites the typology is inverted because of the topography, thus confirming a common criterion based on the reading of the ground as a consequence of the search for the space of the view. Open spaces stand out in all the villas: spaces open to the outdoor air, some wrapped, some arcaded, doors to the horizon which open up to the exterior, on the roof of the house. There are open and covered spaces, spaces between the exterior and interior, on platforms with terraces or surrounding patios, enclosures and open rooms. We discover an achievement of the 20th century in the positive and negative spaces which translate or play with the setting, which occupy or are subtracted from built contours, and which obtain intermediate spaces in the search for the relationship with the sea. The tools used are the drawings by the authors and of the houses visited, the cast of travellers and their travels, and the knowledge gained from study of the projects. Through comparison by means of partial approaches, the drawings define the view of the sea, the mode of occupation, and the way of relating to the landscape. Architecture for living by the sea in the 20th century, carried out both by and for architects, shapes the land and constructs the view, creating open spaces in the relationship between the house and the maritime surroundings.
Abstract:
Human Activity Recognition (HAR) is an emerging research field that aims to identify the actions carried out by a person given a set of observations and the surrounding environment. The rapid growth of this field within the scientific community is mainly explained by the large number of applications that have arisen in recent years. Many of the most promising applications are related to healthcare, where it is possible to track the mobility of patients with motor dysfunction as well as the physical activity of patients with cardiovascular risk. Until a few years ago, patient follow-up relied on distinct kinds of sensors. However, this was far from a long-term solution, and with the arrival of smartphones, such monitoring can be achieved non-invasively using the smartphone's embedded sensors. For these reasons, this Final Degree Project sets out to evaluate new feature extraction techniques for activity recognition, user recognition, and activity segmentation. Recognition is based on the integration of inertial signals obtained from two sensors found in most smartphones: the accelerometer and the gyroscope. In particular, six different activities are evaluated: walking, walking upstairs, walking downstairs, sitting, standing, and lying. Furthermore, a segmentation task is carried out taking into account the activities performed by thirty users. This is done using Hidden Markov Models and a set of tools that has proven successful in speech recognition: HTK (Hidden Markov Model Toolkit).
Abstract:
The purpose of this research was to study some facial analyses used for orthodontic diagnosis and to verify the agreement between the lateral and frontal views in the evaluation of facial pleasantness for the layperson and professional groups, the agreement between these groups in the evaluation of facial pleasantness in the lateral and frontal views, as well as the association between facial pleasantness and the Golden Proportion, between facial pleasantness and Facial Pattern, and between Facial Pattern and the Golden Proportion. A total of 208 standardized facial photographs (104 lateral and 104 frontal) of 104 randomly chosen individuals were used; these were first classified as pleasant, acceptable, or unpleasant by two distinct groups: the Orthodontics group and the Laypersons group. The lateral and frontal photographs were subjected to Facial Golden Proportion measurements by means of a computer program, and the individuals were classified according to Facial Pattern based on their lateral appearance. After statistical analysis, it was found that there was no agreement among the facial pleasantness evaluation variables studied, nor was there an association between the Golden Proportion and either facial pleasantness or Facial Pattern. Between facial pleasantness and Facial Pattern, a strongly positive association was observed for the lateral view, whereas for the frontal view there was no association for either group of evaluators.
Abstract:
Computer speech synthesis has reached a high level of performance, with increasingly sophisticated models of linguistic structure, low error rates in text analysis, and high intelligibility in synthesis from phonemic input. Mass market applications are beginning to appear. However, the results are still not good enough for the ubiquitous application that such technology will eventually have. A number of alternative directions of current research aim at the ultimate goal of fully natural synthetic speech. One especially promising trend is the systematic optimization of large synthesis systems with respect to formal criteria of evaluation. Speech recognition has progressed rapidly in the past decade through such approaches, and it seems likely that their application in synthesis will produce similar improvements.