48 results for knowledge-based system
Abstract:
A new language recognition technique is described that applies the Shifted Delta Coefficients (SDC) approach to phone log-likelihood ratio (PLLR) features. The new methodology incorporates long-span phonetic information at a frame-by-frame level while accounting for the temporal length of each phone unit. The proposed features are used to train an i-vector based system, which is tested on the Albayzin LRE 2012 dataset. The results show a relative improvement of 33.3% in Cavg over several state-of-the-art acoustic i-vector based systems. The paper also presents the integration of parallel phone ASR systems, each of which generates multiple PLLR coefficients that are stacked together and then projected into a reduced dimension. Finally, the paper shows that incorporating state information from the phone ASR provides additional improvements, and that fusion with the other acoustic and phonotactic systems yields an improvement of 25.8% over the system presented during the competition.
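As a rough illustration of the idea of applying shifted deltas to PLLR features, the sketch below uses the textbook SDC scheme (delta spacing d, block shift P, k stacked blocks) and one common definition of PLLR. Both are assumptions: the abstract gives neither the formula nor the parameter values, and the function names `pllr` and `shifted_deltas` are hypothetical. It does not model the phone-duration handling mentioned above.

```python
import math

def pllr(posteriors, eps=1e-10):
    """Convert one frame of phone posteriors into log-likelihood ratios.

    Assumed definition: log(p_i / ((1 - p_i) / (N - 1))) for N > 1 phones.
    """
    n = len(posteriors)
    out = []
    for p in posteriors:
        p = min(max(p, eps), 1.0 - eps)  # clip so the logarithm stays finite
        out.append(math.log(p / ((1.0 - p) / (n - 1))))
    return out

def shifted_deltas(frames, d=1, P=3, k=7):
    """Stack k delta blocks, each shifted by P frames, for every frame t.

    frames: list of equal-length feature vectors, one per frame.
    Block i of frame t is frames[t + i*P + d] - frames[t + i*P - d];
    indices outside the utterance are clamped to the first/last frame.
    """
    T = len(frames)

    def at(idx):
        return frames[min(max(idx, 0), T - 1)]

    out = []
    for t in range(T):
        stacked = []
        for i in range(k):
            ahead = at(t + i * P + d)
            behind = at(t + i * P - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        out.append(stacked)
    return out

# Usage: two frames of posteriors over three hypothetical phones.
features = [pllr(f) for f in [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]]
sdc = shifted_deltas(features, d=1, P=1, k=2)  # each output vector has 2 * 3 values
```

Each output vector has k times the input dimension, which is how a long temporal span ends up encoded in a single frame-level feature.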
Abstract:
One of the greatest challenges for the scientific community is to give machines, in the future, the capabilities of the human visual and cognitive systems, so that, for example, in video surveillance environments they can automatically provide a reliable description of what is happening in the scene. This thesis proposes a reference framework and uses it to discuss and set out the steps needed to develop more intelligent systems capable of extracting and analysing, at different levels of abstraction and by means of distinct independent processing modules, the information needed to understand what is happening in a broad set of scenarios of different kinds. It starts from a requirements analysis and identifies the current challenges for this type of system, which in themselves constitute the objectives of the thesis. It thereby contributes a knowledge-based data model for analysing different situations in which people and vehicles are the main actors, while leaving the door open to adaptation to other domains. The thesis also studies the different processes that can be launched internally, as well as the need to integrate feedback mechanisms at different levels so that the system can adapt better to changes in the environment. As a result, a hierarchical reference framework is proposed that integrates perception, interpretation and learning capabilities to overcome the challenges identified in this field, making it possible to develop more robust, flexible and intelligent surveillance systems capable of operating in a variety of environments. Experimental results on different data samples (mainly video sequences) demonstrate the effectiveness of the proposed framework compared with others proposed in the past.
A first case study demonstrates the creation of a monitoring system for outdoor parking environments that detects vehicles and analyses free parking spaces. A second case study demonstrates the flexibility of the proposed reference framework to adapt to the requirements of a completely different surveillance environment: a smart home, where the study focuses on the automatic analysis of activities of daily living.
ABSTRACT One of the most ambitious objectives for the Computer Vision and Pattern Recognition research community is for machines to achieve capacities similar to those of the human visual and cognitive system, and thus provide a trustworthy description of what is happening in the scene under surveillance. A number of well-established architectural frameworks for scenario understanding, intended for developing applications that work in a variety of environments, can be found in the literature. In this thesis, a highly descriptive methodology for the development of scene understanding applications is presented. It consists of a set of formal guidelines that let machines extract and analyse, at different levels of abstraction and by means of independent processing modules that interact with each other, the information necessary to understand a broad set of different real-world surveillance scenarios. Taking into account the challenges posed by working at both low and high levels, we contribute a highly descriptive knowledge-based data model for the analysis of different situations in which people and vehicles are the main actors, leaving the door open for the development of applications in diverse smart domains. Recommendations for letting systems achieve high-level behaviour understanding are also provided.
Furthermore, feedback mechanisms are proposed so that any system can better understand the environment and its surrounding logical context, thereby reducing uncertainty and noise and increasing robustness and precision in the face of low-level or high-level errors. As a result, a hierarchical cognitive reference architecture is proposed that integrates the perception, interpretation, attention and learning capabilities needed to overcome the main challenges identified in this area of research, making it possible to develop more robust, flexible and smart surveillance systems that cope with the different requirements of a variety of environments. Once the crucial issues that should be treated explicitly in the design of this kind of system have been formulated and discussed, experimental results show the effectiveness of the proposed framework compared with others proposed in the past. Two case studies were implemented to test the capabilities of the framework. The first case study shows how the proposed framework can be used to create intelligent parking monitoring systems. The second case study demonstrates the flexibility of the system to cope with the requirements of a completely different environment, a smart home where activities of daily living are performed. Finally, general conclusions and lines of future work to further enhance the capabilities of the proposed framework are presented.
Abstract:
Vector reconstruction of objects from an unstructured point cloud obtained with a LiDAR (light detection and ranging) based system is one of the most promising methods for building three-dimensional models of orchards. The cylinder fitting method for reconstructing the woody structure of leafless trees from point clouds obtained with a mobile terrestrial laser scanner (MTLS) has been analysed. The advantage of this method is that it performs the reconstruction in a single step. The most time-consuming part of the algorithm is the generation of the cylinder direction, which must be recalculated each time a point is included in the cylinder. The tree skeleton is obtained at the same time as the cluster of cylinders is formed. The method does not guarantee a unique convergence, and the reconstruction parameter values must be chosen carefully. A balanced processing of clusters has also been defined, which follows the hierarchy of branches, predecessors and successors and has proven very efficient in terms of processing time. The algorithm was applied to simulated MTLS data of virtual orchard models and to MTLS data of real orchards. The constraints applied in the method have been reviewed to ensure better convergence and simpler use of parameters. The results show a correct reconstruction of the woody structure of the trees, and the algorithm runs in linear-logarithmic time.
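The per-point recalculation of the cylinder direction described above can be pictured with the following minimal sketch. It assumes the direction is taken as the dominant principal axis of the points currently in the cylinder, found by power iteration, and that a point is accepted when its distance to that axis is within a hypothetical `max_radius`. The abstract does not state the actual direction estimate or acceptance rule, so every name and criterion here is illustrative only.

```python
import math

def cylinder_direction(points, iterations=100):
    """Estimate the dominant axis of 3-D points by power iteration on their
    scatter matrix (a PCA-style assumption, not the paper's stated rule)."""
    n = len(points)
    c = [sum(p[i] for p in points) / n for i in range(3)]
    # Unnormalised 3x3 scatter matrix; only its dominant direction matters.
    cov = [[sum((p[i] - c[i]) * (p[j] - c[j]) for p in points)
            for j in range(3)] for i in range(3)]
    s = 1.0 / math.sqrt(3.0)
    v = [s, s, s]  # start vector; assumed not orthogonal to the true axis
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate input (e.g. all points identical)
            break
        v = [x / norm for x in w]
    return v

def radial_distance(point, origin, axis):
    """Distance from a point to the line through origin along a unit axis."""
    rel = [point[i] - origin[i] for i in range(3)]
    along = sum(rel[i] * axis[i] for i in range(3))
    perp = [rel[i] - along * axis[i] for i in range(3)]
    return math.sqrt(sum(x * x for x in perp))

def grow_cylinder(seed, candidates, max_radius):
    """Add candidates one by one, re-estimating the axis before each test."""
    members = list(seed)
    for p in candidates:
        axis = cylinder_direction(members)  # recomputed at every inclusion
        n = len(members)
        centroid = [sum(q[i] for q in members) / n for i in range(3)]
        if radial_distance(p, centroid, axis) <= max_radius:
            members.append(p)
    return members, cylinder_direction(members)

# Usage: a thin vertical branch plus one outlier far from the axis.
seed = [(0.01, 0.0, 0.0), (-0.01, 0.0, 1.0), (0.0, 0.01, 2.0)]
candidates = [(0.0, 0.0, 3.0), (5.0, 5.0, 3.5), (0.0, -0.01, 4.0)]
members, axis = grow_cylinder(seed, candidates, max_radius=0.1)
```

Recomputing the axis from scratch for every candidate, as this sketch does, makes the cost of that step easy to see and is consistent with the abstract naming it as the most time-consuming part of the algorithm.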