182 resultados para Aprendizaje automático (Inteligencia artificial)
Resumo:
The need to digitise music scores has led to the development of Optical Music Recognition (OMR) tools. Unfortunately, the performance of these systems is still far from providing acceptable results. This situation forces the user to be involved in the process due to the need of correcting the mistakes made during recognition. However, this correction is performed over the output of the system, so these interventions are not exploited to improve the performance of the recognition. This work sets the scenario in which human and machine interact to accurately complete the OMR task with the least possible effort for the user.
Resumo:
In this paper, a multimodal and interactive prototype to perform music genre classification is presented. The system is oriented to multi-part files in symbolic format but it can be adapted using a transcription system to transform audio content in music scores. This prototype uses different sources of information to give a possible answer to the user. It has been developed to allow a human expert to interact with the system to improve its results. In its current implementation, it offers a limited range of interaction and multimodality. Further development aimed at full interactivity and multimodal interactions is discussed.
Resumo:
El campo de procesamiento de lenguaje natural (PLN), ha tenido un gran crecimiento en los últimos años; sus áreas de investigación incluyen: recuperación y extracción de información, minería de datos, traducción automática, sistemas de búsquedas de respuestas, generación de resúmenes automáticos, análisis de sentimientos, entre otras. En este artículo se presentan conceptos y algunas herramientas con el fin de contribuir al entendimiento del procesamiento de texto con técnicas de PLN, con el propósito de extraer información relevante que pueda ser usada en un gran rango de aplicaciones. Se pueden desarrollar clasificadores automáticos que permitan categorizar documentos y recomendar etiquetas; estos clasificadores deben ser independientes de la plataforma, fácilmente personalizables para poder ser integrados en diferentes proyectos y que sean capaces de aprender a partir de ejemplos. En el presente artículo se introducen estos algoritmos de clasificación, se analizan algunas herramientas de código abierto disponibles actualmente para llevar a cabo estas tareas y se comparan diversas implementaciones utilizando la métrica F en la evaluación de los clasificadores.
Resumo:
Paper submitted to MML 2013, 6th International Workshop on Machine Learning and Music, Prague, September 23, 2013.
Resumo:
Several recent works deal with 3D data in mobile robotic problems, e.g., mapping. Data comes from any kind of sensor (time of flight, Kinect or 3D lasers) that provide a huge amount of unorganized 3D data. In this paper we detail an efficient approach to build complete 3D models using a soft computing method, the Growing Neural Gas (GNG). As neural models deal easily with noise, imprecision, uncertainty or partial data, GNG provides better results than other approaches. The GNG obtained is then applied to a sequence. We present a comprehensive study on GNG parameters to ensure the best result at the lowest time cost. From this GNG structure, we propose to calculate planar patches and thus obtaining a fast method to compute the movement performed by a mobile robot by means of a 3D models registration algorithm. Final results of 3D mapping are also shown.
Resumo:
Several recent works deal with 3D data in mobile robotic problems, e.g. mapping or egomotion. Data comes from any kind of sensor such as stereo vision systems, time of flight cameras or 3D lasers, providing a huge amount of unorganized 3D data. In this paper, we describe an efficient method to build complete 3D models from a Growing Neural Gas (GNG). The GNG is applied to the 3D raw data and it reduces both the subjacent error and the number of points, keeping the topology of the 3D data. The GNG output is then used in a 3D feature extraction method. We have performed a deep study in which we quantitatively show that the use of GNG improves the 3D feature extraction method. We also show that our method can be applied to any kind of 3D data. The 3D features obtained are used as input in an Iterative Closest Point (ICP)-like method to compute the 6DoF movement performed by a mobile robot. A comparison with standard ICP is performed, showing that the use of GNG improves the results. Final results of 3D mapping from the egomotion calculated are also shown.
Resumo:
Self-organising neural models have the ability to provide a good representation of the input space. In particular the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This paper proposes a Graphics Processing Unit (GPU) parallel implementation of the GNG with Compute Unified Device Architecture (CUDA). In contrast to existing algorithms, the proposed GPU implementation allows the acceleration of the learning process keeping a good quality of representation. Comparative experiments using iterative, parallel and hybrid implementations are carried out to demonstrate the effectiveness of CUDA implementation. The results show that GNG learning with the proposed implementation achieves a speed-up of 6× compared with the single-threaded CPU implementation. GPU implementation has also been applied to a real application with time constraints: acceleration of 3D scene reconstruction for egomotion, in order to validate the proposal.
Resumo:
Paper submitted to the 39th International Symposium on Robotics ISR 2008, Seoul, South Korea, October 15-17, 2008.
Resumo:
Nowadays, the use of RGB-D sensors have focused a lot of research in computer vision and robotics. These kinds of sensors, like Kinect, allow to obtain 3D data together with color information. However, their working range is limited to less than 10 meters, making them useless in some robotics applications, like outdoor mapping. In these environments, 3D lasers, working in ranges of 20-80 meters, are better. But 3D lasers do not usually provide color information. A simple 2D camera can be used to provide color information to the point cloud, but a calibration process between camera and laser must be done. In this paper we present a portable calibration system to calibrate any traditional camera with a 3D laser in order to assign color information to the 3D points obtained. Thus, we can use laser precision and simultaneously make use of color information. Unlike other techniques that make use of a three-dimensional body of known dimensions in the calibration process, this system is highly portable because it makes use of small catadioptrics that can be placed in a simple manner in the environment. We use our calibration system in a 3D mapping system, including Simultaneous Location and Mapping (SLAM), in order to get a 3D colored map which can be used in different tasks. We show that an additional problem arises: 2D cameras information is different when lighting conditions change. So when we merge 3D point clouds from two different views, several points in a given neighborhood could have different color information. A new method for color fusion is presented, obtaining correct colored maps. The system will be tested by applying it to 3D reconstruction.
Resumo:
Paper submitted to the 43rd International Symposium on Robotics (ISR2012), Taipei, Taiwan, Aug. 29-31, 2012.
Resumo:
A parallel algorithm for image noise removal is proposed. The algorithm is based on peer group concept and uses a fuzzy metric. An optimization study on the use of the CUDA platform to remove impulsive noise using this algorithm is presented. Moreover, an implementation of the algorithm on multi-core platforms using OpenMP is presented. Performance is evaluated in terms of execution time and a comparison of the implementation parallelised in multi-core, GPUs and the combination of both is conducted. A performance analysis with large images is conducted in order to identify the amount of pixels to allocate in the CPU and GPU. The observed time shows that both devices must have work to do, leaving the most to the GPU. Results show that parallel implementations of denoising filters on GPUs and multi-cores are very advisable, and they open the door to use such algorithms for real-time processing.
Resumo:
El particionado hardware/software es una tarea fundamental en el co-diseño de sistemas embebidos. En ella se decide, teniendo en cuenta las métricas de diseño, qué componentes se ejecutarán en un procesador de propósito general (software) y cuáles en un hardware específico. En los últimos años se han propuesto diversas soluciones al problema del particionado dirigidas por algoritmos metaheurísticos. Sin embargo, debido a la diversidad de modelos y métricas utilizadas, la elección del algoritmo más apropiado sigue siendo un problema abierto. En este trabajo se presenta una comparación de seis algoritmos metaheurísticos: Búsqueda aleatoria (Random search), Búsqueda tabú (Tabu search), Recocido simulado (Simulated annealing), Escalador de colinas estocástico (Stochastic hill climbing), Algoritmo genético (Genetic algorithm) y Estrategia evolutiva (Evolution strategy). El modelo utilizado en la comparación está dirigido a minimizar el área ocupada y el tiempo de ejecución, las restricciones del modelo son consideradas como penalizaciones para incluir en el espacio de búsqueda otras soluciones. Los resultados muestran que los algoritmos Escalador de colinas estocástico y Estrategia evolutiva son los que mejores resultados obtienen en general, seguidos por el Algoritmo genético.
Resumo:
Feature vectors can be anything from simple surface normals to more complex feature descriptors. Feature extraction is important to solve various computer vision problems: e.g. registration, object recognition and scene understanding. Most of these techniques cannot be computed online due to their complexity and the context where they are applied. Therefore, computing these features in real-time for many points in the scene is impossible. In this work, a hardware-based implementation of 3D feature extraction and 3D object recognition is proposed to accelerate these methods and therefore the entire pipeline of RGBD based computer vision systems where such features are typically used. The use of a GPU as a general purpose processor can achieve considerable speed-ups compared with a CPU implementation. In this work, advantageous results are obtained using the GPU to accelerate the computation of a 3D descriptor based on the calculation of 3D semi-local surface patches of partial views. This allows descriptor computation at several points of a scene in real-time. Benefits of the accelerated descriptor have been demonstrated in object recognition tasks. Source code will be made publicly available as contribution to the Open Source Point Cloud Library.
Resumo:
Nowadays, there is an increasing number of robotic applications that need to act in real three-dimensional (3D) scenarios. In this paper we present a new mobile robotics orientated 3D registration method that improves previous Iterative Closest Points based solutions both in speed and accuracy. As an initial step, we perform a low cost computational method to obtain descriptions for 3D scenes planar surfaces. Then, from these descriptions we apply a force system in order to compute accurately and efficiently a six degrees of freedom egomotion. We describe the basis of our approach and demonstrate its validity with several experiments using different kinds of 3D sensors and different 3D real environments.
Resumo:
El documento de esta tesis por compendio de publicaciones se divide en dos partes: la síntesis donde se resume la fundamentación, resultados y conclusiones de esta tesis, y las propias publicaciones en su formato original, que se incluyen como apéndices. Dado que existen acuerdo de confidencialidad (véase "Derechos" más adelante) que impiden su publicación en formato electrónico de forma pública y abierta (como es el repositorio de la UA), y acorde con lo que se dictamina en el punto 6 del artículo 14 del RD 99/2011, de 28 de enero, no se incluyen estos apéndices en el documento electrónico que se presenta en cedé, pero se incluyen las referencias completas y sí se incluyen integramente en el ejemplar encuadernado. Si el CEDIP y el RUA así lo decidiesen más adelante, podría modificarse este documento electrónico para incluir los enlaces a los artículos originales.