Biblioteca Digital

87 resultados para Encoder

Multi-Resolution Texture Coding for Multi-Resolution 3D Meshes

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We present an innovative system to encode and transmit textured multi-resolution 3D meshes in a progressive way, with no need to send several texture images, one for each mesh LOD (Level Of Detail). All texture LODs are created from the finest one (associated to the finest mesh), but can be re- constructed progressively from the coarsest thanks to refinement images calculated in the encoding process, and transmitted only if needed. This allows us to adjust the LOD/quality of both 3D mesh and texture according to the rendering power of the device that will display them, and to the network capacity. Additionally, we achieve big savings in data transmission by avoiding altogether texture coordinates, which are generated automatically thanks to an unwrapping system agreed upon by both encoder and decoder.

Estudio de calidad de codificación utilizando los perfiles 10 Bits del estándar H.264

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Desde los inicios de la codificación de vídeo digital hasta hoy, tanto la señal de video sin comprimir de entrada al codificador como la señal de salida descomprimida del decodificador, independientemente de su resolución, uso de submuestreo en los planos de diferencia de color, etc. han tenido siempre la característica común de utilizar 8 bits para representar cada una de las muestras. De la misma manera, los estándares de codificación de vídeo imponen trabajar internamente con estos 8 bits de precisión interna al realizar operaciones con las muestras cuando aún no se han transformado al dominio de la frecuencia. Sin embargo, el estándar H.264, en gran auge hoy en día, permite en algunos de sus perfiles orientados al mundo profesional codificar vídeo con más de 8 bits por muestra. Cuando se utilizan estos perfiles, las operaciones efectuadas sobre las muestras todavía sin transformar se realizan con la misma precisión que el número de bits del vídeo de entrada al codificador. Este aumento de precisión interna tiene el potencial de permitir unas predicciones más precisas, reduciendo el residuo a codificar y aumentando la eficiencia de codificación para una tasa binaria dada. El objetivo de este Proyecto Fin de Carrera es estudiar, utilizando las medidas de calidad visual objetiva PSNR (Peak Signal to Noise Ratio, relación señal ruido de pico) y SSIM (Structural Similarity, similaridad estructural), el efecto sobre la eficiencia de codificación y el rendimiento al trabajar con una cadena de codificación/descodificación H.264 de 10 bits en comparación con una cadena tradicional de 8 bits. Para ello se utiliza el codificador de código abierto x264, capaz de codificar video de 8 y 10 bits por muestra utilizando los perfiles High, High 10, High 4:2:2 y High 4:4:4 Predictive del estándar H.264. Debido a la ausencia de herramientas adecuadas para calcular las medidas PSNR y SSIM de vídeo con más de 8 bits por muestra y un tipo de submuestreo de planos de diferencia de color distinto al 4:2:0, como parte de este proyecto se desarrolla también una aplicación de análisis en lenguaje de programación C capaz de calcular dichas medidas a partir de dos archivos de vídeo sin comprimir en formato YUV o Y4M. ABSTRACT Since the beginning of digital video compression, the uncompressed video source used as input stream to the encoder and the uncompressed decoded output stream have both used 8 bits for representing each sample, independent of resolution, chroma subsampling scheme used, etc. In the same way, video coding standards force encoders to work internally with 8 bits of internal precision when working with samples before being transformed to the frequency domain. However, the H.264 standard allows coding video with more than 8 bits per sample in some of its professionally oriented profiles. When using these profiles, all work on samples still in the spatial domain is done with the same precision the input video has. This increase in internal precision has the potential of allowing more precise predictions, reducing the residual to be encoded, and thus increasing coding efficiency for a given bitrate. The goal of this Project is to study, using PSNR (Peak Signal to Noise Ratio) and SSIM (Structural Similarity) objective video quality metrics, the effects on coding efficiency and performance caused by using an H.264 10 bit coding/decoding chain compared to a traditional 8 bit chain. In order to achieve this goal the open source x264 encoder is used, which allows encoding video with 8 and 10 bits per sample using the H.264 High, High 10, High 4:2:2 and High 4:4:4 Predictive profiles. Given that no proper tools exist for computing PSNR and SSIM values of video with more than 8 bits per sample and chroma subsampling schemes other than 4:2:0, an analysis application written in the C programming language is developed as part of this Project. This application is able to compute both metrics from two uncompressed video files in the YUV or Y4M format.

A framework for the analysis and optimization of delay in multiview video coding schemes

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Esta tesis presenta un novedoso marco de referencia para el análisis y optimización del retardo de codificación y descodificación para vídeo multivista. El objetivo de este marco de referencia es proporcionar una metodología sistemática para el análisis del retardo en codificadores y descodificadores multivista y herramientas útiles en el diseño de codificadores/descodificadores para aplicaciones con requisitos de bajo retardo. El marco de referencia propuesto caracteriza primero los elementos que tienen influencia en el comportamiento del retardo: i) la estructura de predicción multivista, ii) el modelo hardware del codificador/descodificador y iii) los tiempos de proceso de cuadro. En segundo lugar, proporciona algoritmos para el cálculo del retardo de codificación/ descodificación de cualquier estructura arbitraria de predicción multivista. El núcleo de este marco de referencia consiste en una metodología para el análisis del retardo de codificación/descodificación multivista que es independiente de la arquitectura hardware del codificador/descodificador, completada con un conjunto de modelos que particularizan este análisis del retardo con las características de la arquitectura hardware del codificador/descodificador. Entre estos modelos, aquellos basados en teoría de grafos adquieren especial relevancia debido a su capacidad de desacoplar la influencia de los diferentes elementos en el comportamiento del retardo en el codificador/ descodificador, mediante una abstracción de su capacidad de proceso. Para revelar las posibles aplicaciones de este marco de referencia, esta tesis presenta algunos ejemplos de su utilización en problemas de diseño que afectan a codificadores y descodificadores multivista. Este escenario de aplicación cubre los siguientes casos: estrategias para el diseño de estructuras de predicción que tengan en consideración requisitos de retardo además del comportamiento tasa-distorsión; diseño del número de procesadores y análisis de los requisitos de velocidad de proceso en codificadores/ descodificadores multivista dado un retardo objetivo; y el análisis comparativo del comportamiento del retardo en codificadores multivista con diferentes capacidades de proceso e implementaciones hardware. ABSTRACT This thesis presents a novel framework for the analysis and optimization of the encoding and decoding delay for multiview video. The objective of this framework is to provide a systematic methodology for the analysis of the delay in multiview encoders and decoders and useful tools in the design of multiview encoders/decoders for applications with low delay requirements. The proposed framework characterizes firstly the elements that have an influence in the delay performance: i) the multiview prediction structure ii) the hardware model of the encoder/decoder and iii) frame processing times. Secondly, it provides algorithms for the computation of the encoding/decoding delay of any arbitrary multiview prediction structure. The core of this framework consists in a methodology for the analysis of the multiview encoding/decoding delay that is independent of the hardware architecture of the encoder/decoder, which is completed with a set of models that particularize this delay analysis with the characteristics of the hardware architecture of the encoder/decoder. Among these models, the ones based in graph theory acquire special relevance due to their capacity to detach the influence of the different elements in the delay performance of the encoder/decoder, by means of an abstraction of its processing capacity. To reveal possible applications of this framework, this thesis presents some examples of its utilization in design problems that affect multiview encoders and decoders. This application scenario covers the following cases: strategies for the design of prediction structures that take into consideration delay requirements in addition to the rate-distortion performance; design of number of processors and analysis of processor speed requirements in multiview encoders/decoders given a target delay; and comparative analysis of the encoding delay performance of multiview encoders with different processing capabilities and hardware implementations.

Sistema de control para optimizar la eficiencia de un equipo fotovoltaico de bombeo directo accionado por un motor de inducción

Relevância:

10.00% 10.00%

Publicador:

Resumo:

El objetivo principal está recogido en el título de la Tesis. Ampliando éste para hacerlo más explícito, puede decirse que se trata de “desarrollar un sistema de control para que una instalación fotovoltaica de bombeo directo con una bomba centrífuga accionada por un motor de inducción trabaje de la forma más eficiente posible”. Para lograr ese propósito se establecieron los siguientes objetivos específicos: 1. Diseñar y construir un prototipo de instalación fotovoltaica de bombeo directo que utilice principalmente elementos de bajo coste y alta fiabilidad. Para cumplir esos requisitos la instalación consta de un generador fotovoltaico con módulos de silicio monocristalino, una bomba centrífuga accionada por un motor de inducción y un inversor que controla vectorialmente el motor. Los módulos de silicio monocristalino, el motor asíncrono y la bomba centrífuga son, en sus respectivas categorías, los elementos más robustos y fiables que existen, pudiendo ser adquiridos, instalados e incluso reparados (el motor y la bomba) por personas con una mínima formación técnica en casi cualquier lugar del mundo. El inversor no es tan fiable ni fácil de reparar. Ahora bien, para optimizar la potencia que entrega el generador y tener algún tipo de control sobre el motor se necesita al menos un convertidor electrónico. Por tanto, la inclusión del inversor en el sistema no reduce su fiabilidad ni supone un aumento del coste. La exigencia de que el inversor pueda realizar el control vectorial del motor responde a la necesidad de optimizar tanto la operación del conjunto motor-bomba como la del generador fotovoltaico. Como más adelante se indica, lograr esa optimización es otro de los objetivos que se plantea. 2. Reducir al mínimo el número de elementos de medida y control que necesita el sistema para su operación (sensorless control). Con ello se persigue aumentar la robustez y fiabilidad del sistema y reducir sus operaciones de mantenimiento, buscando que sea lo más económico posible. Para ello se deben evitar todas las medidas que pudieran ser redundantes, tomando datos sólo de las variables eléctricas que no pueden obtenerse de otra forma (tensión e intensidad en corriente continua y dos intensidades en corriente alterna) y estimando la velocidad del rotor (en vez de medirla con un encoder u otro dispositivo equivalente). 3. Estudiar posibles formas de mejorar el diseño y la eficiencia de estas instalaciones. Se trata de establecer criterios para seleccionar los dispositivos mas eficientes o con mejor respuesta, de buscar las condiciones para la operación óptima, de corregir problemas de desacoplo entre subsistemas, etc. Mediante el análisis de cada una de las partes de las que consta la instalación se plantearán estrategias para minimizar pérdidas, pautas que permitan identificar los elementos más óptimos y procedimientos de control para que la operación del sistema pueda alcanzar la mayor eficiente posible. 4. Implementar un modelo de simulación del sistema sobre el que ensayar las estrategias de control que sean susceptibles de llevar a la práctica. Para modelar el generador fotovoltaico se requiere un conjunto de parámetros que es necesario estimar previamente a partir de datos obtenidos de los catálogos de los módulos a utilizar o mediante ensayos. Igual sucede con los parámetros para modelar el motor. Como se pretende que el motor trabaje siempre con la máxima eficiencia será necesario realizar su control vectorial, por lo que el modelo que se implemente debe ser también vectorial. Ahora bien, en el modelo vectorial estándar que normalmente se utiliza en los esquemas de control se consideran nulas las pérdidas en el hierro, por lo que sólo se podrá utilizar ese modelo para evaluar la eficiencia del motor si previamente se modifica para que incluya el efecto de dichas pérdidas. 5. Desarrollar un procedimiento de control para que el inversor consiga que el motor trabaje con mínimas pérdidas y a la vez el generador entregue la máxima potencia. Tal como se ha mencionado en el primer objetivo, se trata de establecer un procedimiento de control que determine las señales de consigna más convenientes para que el inversor pueda imponer en cada momento al motor las corrientes de estator para las que sus pérdidas son mínimas. Al mismo tiempo el procedimiento de control debe ser capaz de variar las señales de consigna que recibe el inversor para que éste pueda hacer que el motor demande más o menos potencia al generador fotovoltaico. Actuando de esa forma se puede lograr que el generador fotovoltaico trabaje entregando la máxima potencia. El procedimiento de control desarrollado se implementará en un DSP encargado de generar las señales de referencia que el inversor debe imponer al motor para que trabaje con mínimas pérdidas y a la vez el generador fotovoltaico entregue la máxima potencia.

Active Optical Sensors for Tree Stem Detection and Classification in Nurseries.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize their reliability, as well as their advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified, between alive or dead tree states at a 93.75% success rate compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops.

3D Videoconferencing system using spatial scalability

Relevância:

10.00% 10.00%

Publicador:

Resumo:

An implementation of a real-time 3D videoconferencing system using the currently available technology is presented. This appr oach is based on the side by side spatial compression of the stereoscopic images . The encoder and the decoder have b een implemented in a standard personal computer and a conventional 3D comp atible TV has been used to present the frames. Moreover, the users without 3D technology can use the system because 2D compatibility mode has been implemented in the decoder. The performance res ults show that a conventional computer can be used for encod ing/decoding audio and video streams and the delay in the transmission is lower than 200 ms.

A framework for the analysis and optimization of encoding latency for multiview video

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We present a novel framework for the analysis and optimization of encoding latency for multiview video. Firstly, we characterize the elements that have an influence in the encoding latency performance: (i) the multiview prediction structure and (ii) the hardware encoder model. Then, we provide algorithms to find the encoding latency of any arbitrary multiview prediction structure. The proposed framework relies on the directed acyclic graph encoder latency (DAGEL) model, which provides an abstraction of the processing capacity of the encoder by considering an unbounded number of processors. Using graph theoretic algorithms, the DAGEL model allows us to compute the encoding latency of a given prediction structure, and determine the contribution of the prediction dependencies to it. As an example of DAGEL application, we propose an algorithm to reduce the encoding latency of a given multiview prediction structure up to a target value. In our approach, a minimum number of frame dependencies are pruned, until the latency target value is achieved, thus minimizing the degradation of the rate-distortion performance due to the removal of the prediction dependencies. Finally, we analyze the latency performance of the DAGEL derived prediction structures in multiview encoders with limited processing capacity.

Energy Management and Optimization for Video Decoders based on Functional-oriented Reconfiguration

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In recent years, the increasing sophistication of embedded multimedia systems and wireless communication technologies has promoted a widespread utilization of video streaming applications. It has been reported in 2013 that youngsters, aged between 13 and 24, spend around 16.7 hours a week watching online video through social media, business websites, and video streaming sites. Video applications have already been blended into people daily life. Traditionally, video streaming research has focused on performance improvement, namely throughput increase and response time reduction. However, most mobile devices are battery-powered, a technology that grows at a much slower pace than either multimedia or hardware developments. Since battery developments cannot satisfy expanding power demand of mobile devices, research interests on video applications technology has attracted more attention to achieve energy-efficient designs. How to efficiently use the limited battery energy budget becomes a major research challenge. In addition, next generation video standards impel to diversification and personalization. Therefore, it is desirable to have mechanisms to implement energy optimizations with greater flexibility and scalability. In this context, the main goal of this dissertation is to find an energy management and optimization mechanism to reduce the energy consumption of video decoders based on the idea of functional-oriented reconfiguration. System battery life is prolonged as the result of a trade-off between energy consumption and video quality. Functional-oriented reconfiguration takes advantage of the similarities among standards to build video decoders reconnecting existing functional units. If a feedback channel from the decoder to the encoder is available, the former can signal the latter changes in either the encoding parameters or the encoding algorithms for energy-saving adaption. The proposed energy optimization and management mechanism is carried out at the decoder end. This mechanism consists of an energy-aware manager, implemented as an additional block of the reconfiguration engine, an energy estimator, integrated into the decoder, and, if available, a feedback channel connected to the encoder end. The energy-aware manager checks the battery level, selects the new decoder description and signals to build a new decoder to the reconfiguration engine. It is worth noting that the analysis of the energy consumption is fundamental for the success of the energy management and optimization mechanism. In this thesis, an energy estimation method driven by platform event monitoring is proposed. In addition, an event filter is suggested to automate the selection of the most appropriate events that affect the energy consumption. At last, a detailed study on the influence of the training data on the model accuracy is presented. The modeling methodology of the energy estimator has been evaluated on different underlying platforms, single-core and multi-core, with different characteristics of workload. All the results show a good accuracy and low on-line computation overhead. The required modifications on the reconfiguration engine to implement the energy-aware manager have been assessed under different scenarios. The results indicate a possibility to lengthen the battery lifetime of the system in two different use-cases.

Diseño y banco de test VHDL de un procesador que realice la ICT para HEVC

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Se va a realizar un estudio de la codificación de imágenes sobre el estándar HEVC (high-effiency video coding). El proyecto se va a centrar en el codificador híbrido, más concretamente sobre la aplicación de la transformada inversa del coseno que se realiza tanto en codificador como en el descodificador. La necesidad de codificar vídeo surge por la aparición de la secuencia de imágenes como señales digitales. El problema principal que tiene el vídeo es la cantidad de bits que aparecen al realizar la codificación. Como consecuencia del aumento de la calidad de las imágenes, se produce un crecimiento exponencial de la cantidad de información a codificar. La utilización de las transformadas al procesamiento digital de imágenes ha aumentado a lo largo de los años. La transformada inversa del coseno se ha convertido en el método más utilizado en el campo de la codificación de imágenes y video. Las ventajas de la transformada inversa del coseno permiten obtener altos índices de compresión a muy bajo coste. La teoría de las transformadas ha mejorado el procesamiento de imágenes. En la codificación por transformada, una imagen se divide en bloques y se identifica cada imagen a un conjunto de coeficientes. Esta codificación se aprovecha de las dependencias estadísticas de las imágenes para reducir la cantidad de datos. El proyecto realiza un estudio de la evolución a lo largo de los años de los distintos estándares de codificación de video. Se analiza el codificador híbrido con más profundidad así como el estándar HEVC. El objetivo final que busca este proyecto fin de carrera es la realización del núcleo de un procesador específico para la ejecución de la transformada inversa del coseno en un descodificador de vídeo compatible con el estándar HEVC. Es objetivo se logra siguiendo una serie de etapas, en las que se va añadiendo requisitos. Este sistema permite al diseñador hardware ir adquiriendo una experiencia y un conocimiento más profundo de la arquitectura final. ABSTRACT. A study about the codification of images based on the standard HEVC (high-efficiency video coding) will be developed. The project will be based on the hybrid encoder, in particular, on the application of the inverse cosine transform, which is used for the encoder as well as for the decoder. The necessity of encoding video arises because of the appearance of the sequence of images as digital signals. The main problem that video faces is the amount of bits that appear when making the codification. As a consequence of the increase of the quality of the images, an exponential growth on the quantity of information that should be encoded happens. The usage of transforms to the digital processing of images has increased along the years. The inverse cosine transform has become the most used method in the field of codification of images and video. The advantages of the inverse cosine transform allow to obtain high levels of comprehension at a very low price. The theory of the transforms has improved the processing of images. In the codification by transform, an image is divided in blocks and each image is identified to a set of coefficients. This codification takes advantage of the statistic dependence of the images to reduce the amount of data. The project develops a study of the evolution along the years of the different standards in video codification. In addition, the hybrid encoder and the standard HEVC are analyzed more in depth. The final objective of this end of degree project is the realization of the nucleus from a specific processor for the execution of the inverse cosine transform in a decoder of video that is compatible with the standard HEVC. This objective is reached following a series of stages, in which requirements are added. This system allows the hardware designer to acquire a deeper experience and knowledge of the final architecture.

Partición hardware software de un codificador JPEG utilizando escalador de colinas estocástico

Relevância:

10.00% 10.00%

Publicador:

Resumo:

La partición hardware/software es una etapa clave dentro del proceso de co-diseño de los sistemas embebidos. En esta etapa se decide qué componentes serán implementados como co-procesadores de hardware y qué componentes serán implementados en un procesador de propósito general. La decisión es tomada a partir de la exploración del espacio de diseño, evaluando un conjunto de posibles soluciones para establecer cuál de estas es la que mejor balance logra entre todas las métricas de diseño. Para explorar el espacio de soluciones, la mayoría de las propuestas, utilizan algoritmos metaheurísticos; destacándose los Algoritmos Genéticos, Recocido Simulado. Esta decisión, en muchos casos, no es tomada a partir de análisis comparativos que involucren a varios algoritmos sobre un mismo problema. En este trabajo se presenta la aplicación de los algoritmos: Escalador de Colinas Estocástico y Escalador de Colinas Estocástico con Reinicio, para resolver el problema de la partición hardware/software. Para validar el empleo de estos algoritmos se presenta la aplicación de este algoritmo sobre un caso de estudio, en particular la partición hardware/software de un codificador JPEG. En todos los experimentos es posible apreciar que ambos algoritmos alcanzan soluciones comparables con las obtenidas por los algoritmos utilizados con más frecuencia.

Designing Regularizers and Architectures for Recurrent Neural Networks

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Cette thèse contribue a la recherche vers l'intelligence artificielle en utilisant des méthodes connexionnistes. Les réseaux de neurones récurrents sont un ensemble de modèles séquentiels de plus en plus populaires capable en principe d'apprendre des algorithmes arbitraires. Ces modèles effectuent un apprentissage en profondeur, un type d'apprentissage machine. Sa généralité et son succès empirique en font un sujet intéressant pour la recherche et un outil prometteur pour la création de l'intelligence artificielle plus générale. Le premier chapitre de cette thèse donne un bref aperçu des sujets de fonds: l'intelligence artificielle, l'apprentissage machine, l'apprentissage en profondeur et les réseaux de neurones récurrents. Les trois chapitres suivants couvrent ces sujets de manière de plus en plus spécifiques. Enfin, nous présentons quelques contributions apportées aux réseaux de neurones récurrents. Le chapitre \ref{arxiv1} présente nos travaux de régularisation des réseaux de neurones récurrents. La régularisation vise à améliorer la capacité de généralisation du modèle, et joue un role clé dans la performance de plusieurs applications des réseaux de neurones récurrents, en particulier en reconnaissance vocale. Notre approche donne l'état de l'art sur TIMIT, un benchmark standard pour cette tâche. Le chapitre \ref{cpgp} présente une seconde ligne de travail, toujours en cours, qui explore une nouvelle architecture pour les réseaux de neurones récurrents. Les réseaux de neurones récurrents maintiennent un état caché qui représente leurs observations antérieures. L'idée de ce travail est de coder certaines dynamiques abstraites dans l'état caché, donnant au réseau une manière naturelle d'encoder des tendances cohérentes de l'état de son environnement. Notre travail est fondé sur un modèle existant; nous décrivons ce travail et nos contributions avec notamment une expérience préliminaire.

Designing Regularizers and Architectures for Recurrent Neural Networks

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Cette thèse contribue a la recherche vers l'intelligence artificielle en utilisant des méthodes connexionnistes. Les réseaux de neurones récurrents sont un ensemble de modèles séquentiels de plus en plus populaires capable en principe d'apprendre des algorithmes arbitraires. Ces modèles effectuent un apprentissage en profondeur, un type d'apprentissage machine. Sa généralité et son succès empirique en font un sujet intéressant pour la recherche et un outil prometteur pour la création de l'intelligence artificielle plus générale. Le premier chapitre de cette thèse donne un bref aperçu des sujets de fonds: l'intelligence artificielle, l'apprentissage machine, l'apprentissage en profondeur et les réseaux de neurones récurrents. Les trois chapitres suivants couvrent ces sujets de manière de plus en plus spécifiques. Enfin, nous présentons quelques contributions apportées aux réseaux de neurones récurrents. Le chapitre \ref{arxiv1} présente nos travaux de régularisation des réseaux de neurones récurrents. La régularisation vise à améliorer la capacité de généralisation du modèle, et joue un role clé dans la performance de plusieurs applications des réseaux de neurones récurrents, en particulier en reconnaissance vocale. Notre approche donne l'état de l'art sur TIMIT, un benchmark standard pour cette tâche. Le chapitre \ref{cpgp} présente une seconde ligne de travail, toujours en cours, qui explore une nouvelle architecture pour les réseaux de neurones récurrents. Les réseaux de neurones récurrents maintiennent un état caché qui représente leurs observations antérieures. L'idée de ce travail est de coder certaines dynamiques abstraites dans l'état caché, donnant au réseau une manière naturelle d'encoder des tendances cohérentes de l'état de son environnement. Notre travail est fondé sur un modèle existant; nous décrivons ce travail et nos contributions avec notamment une expérience préliminaire.

High-fidelity Z-measurement error encoding of optical qubits

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We demonstrate a quantum error correction scheme that protects against accidental measurement, using a parity encoding where the logical state of a single qubit is encoded into two physical qubits using a nondeterministic photonic controlled-NOT gate. For the single qubit input states vertical bar 0 >, vertical bar 1 >, vertical bar 0 > +/- vertical bar 1 >, and vertical bar 0 > +/- i vertical bar 1 > our encoder produces the appropriate two-qubit encoded state with an average fidelity of 0.88 +/- 0.03 and the single qubit decoded states have an average fidelity of 0.93 +/- 0.05 with the original state. We are able to decode the two-qubit state (up to a bit flip) by performing a measurement on one of the qubits in the logical basis; we find that the 64 one-qubit decoded states arising from 16 real and imaginary single-qubit superposition inputs have an average fidelity of 0.96 +/- 0.03.

A new unsupervised convolutional neural network model for Chinese scene text detection

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As one of the most popular deep learning models, convolution neural network (CNN) has achieved huge success in image information extraction. Traditionally CNN is trained by supervised learning method with labeled data and used as a classifier by adding a classification layer in the end. Its capability of extracting image features is largely limited due to the difficulty of setting up a large training dataset. In this paper, we propose a new unsupervised learning CNN model, which uses a so-called convolutional sparse auto-encoder (CSAE) algorithm pre-Train the CNN. Instead of using labeled natural images for CNN training, the CSAE algorithm can be used to train the CNN with unlabeled artificial images, which enables easy expansion of training data and unsupervised learning. The CSAE algorithm is especially designed for extracting complex features from specific objects such as Chinese characters. After the features of articficial images are extracted by the CSAE algorithm, the learned parameters are used to initialize the first CNN convolutional layer, and then the CNN model is fine-Trained by scene image patches with a linear classifier. The new CNN model is applied to Chinese scene text detection and is evaluated with a multilingual image dataset, which labels Chinese, English and numerals texts separately. More than 10% detection precision gain is observed over two CNN models.

Self-tuning algorithm for dsp-based motion control in cnc applications

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis describes the development of an adaptive control algorithm for Computerized Numerical Control (CNC) machines implemented in a multi-axis motion control board based on the TMS320C31 DSP chip. The adaptive process involves two stages: Plant Modeling and Inverse Control Application. The first stage builds a non-recursive model of the CNC system (plant) using the Least-Mean-Square (LMS) algorithm. The second stage consists of the definition of a recursive structure (the controller) that implements an inverse model of the plant by using the coefficients of the model in an algorithm called Forward-Time Calculation (FTC). In this way, when the inverse controller is implemented in series with the plant, it will pre-compensate for the modification that the original plant introduces in the input signal. The performance of this solution was verified at three different levels: Software simulation, implementation in a set of isolated motor-encoder pairs and implementation in a real CNC machine. The use of the adaptive inverse controller effectively improved the step response of the system in all three levels. In the simulation, an ideal response was obtained. In the motor-encoder test, the rise time was reduced by as much as 80%, without overshoot, in some cases. Even with the larger mass of the actual CNC machine, decrease of the rise time and elimination of the overshoot were obtained in most cases. These results lead to the conclusion that the adaptive inverse controller is a viable approach to position control in CNC machinery.

«
1
2
3
4
5
6
»