888 resultados para Parallel processing (Electronic computers) - Research


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis develops high performance real-time signal processing modules for direction of arrival (DOA) estimation for localization systems. It proposes highly parallel algorithms for performing subspace decomposition and polynomial rooting, which are otherwise traditionally implemented using sequential algorithms. The proposed algorithms address the emerging need for real-time localization for a wide range of applications. As the antenna array size increases, the complexity of signal processing algorithms increases, making it increasingly difficult to satisfy the real-time constraints. This thesis addresses real-time implementation by proposing parallel algorithms, that maintain considerable improvement over traditional algorithms, especially for systems with larger number of antenna array elements. Singular value decomposition (SVD) and polynomial rooting are two computationally complex steps and act as the bottleneck to achieving real-time performance. The proposed algorithms are suitable for implementation on field programmable gated arrays (FPGAs), single instruction multiple data (SIMD) hardware or application specific integrated chips (ASICs), which offer large number of processing elements that can be exploited for parallel processing. The designs proposed in this thesis are modular, easily expandable and easy to implement. Firstly, this thesis proposes a fast converging SVD algorithm. The proposed method reduces the number of iterations it takes to converge to correct singular values, thus achieving closer to real-time performance. A general algorithm and a modular system design are provided making it easy for designers to replicate and extend the design to larger matrix sizes. Moreover, the method is highly parallel, which can be exploited in various hardware platforms mentioned earlier. A fixed point implementation of proposed SVD algorithm is presented. The FPGA design is pipelined to the maximum extent to increase the maximum achievable frequency of operation. The system was developed with the objective of achieving high throughput. Various modern cores available in FPGAs were used to maximize the performance and details of these modules are presented in detail. Finally, a parallel polynomial rooting technique based on Newton’s method applicable exclusively to root-MUSIC polynomials is proposed. Unique characteristics of root-MUSIC polynomial’s complex dynamics were exploited to derive this polynomial rooting method. The technique exhibits parallelism and converges to the desired root within fixed number of iterations, making this suitable for polynomial rooting of large degree polynomials. We believe this is the first time that complex dynamics of root-MUSIC polynomial were analyzed to propose an algorithm. In all, the thesis addresses two major bottlenecks in a direction of arrival estimation system, by providing simple, high throughput, parallel algorithms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the development and capabilities of the Smart Home system, people today are entering an era in which household appliances are no longer just controlled by people, but also operated by a Smart System. This results in a more efficient, convenient, comfortable, and environmentally friendly living environment. A critical part of the Smart Home system is Home Automation, which means that there is a Micro-Controller Unit (MCU) to control all the household appliances and schedule their operating times. This reduces electricity bills by shifting amounts of power consumption from the on-peak hour consumption to the off-peak hour consumption, in terms of different “hour price”. In this paper, we propose an algorithm for scheduling multi-user power consumption and implement it on an FPGA board, using it as the MCU. This algorithm for discrete power level tasks scheduling is based on dynamic programming, which could find a scheduling solution close to the optimal one. We chose FPGA as our system’s controller because FPGA has low complexity, parallel processing capability, a large amount of I/O interface for further development and is programmable on both software and hardware. In conclusion, it costs little time running on FPGA board and the solution obtained is good enough for the consumers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this work we propose an image acquisition and processing methodology (framework) developed for performance in-field grapes and leaves detection and quantification, based on a six step methodology: 1) image segmentation through Fuzzy C-Means with Gustafson Kessel (FCM-GK) clustering; 2) obtaining of FCM-GK outputs (centroids) for acting as seeding for K-Means clustering; 3) Identification of the clusters generated by K-Means using a Support Vector Machine (SVM) classifier. 4) Performance of morphological operations over the grapes and leaves clusters in order to fill holes and to eliminate small pixels clusters; 5)Creation of a mosaic image by Scale-Invariant Feature Transform (SIFT) in order to avoid overlapping between images; 6) Calculation of the areas of leaves and grapes and finding of the centroids in the grape bunches. Image data are collected using a colour camera fixed to a mobile platform. This platform was developed to give a stabilized surface to guarantee that the images were acquired parallel to de vineyard rows. In this way, the platform avoids the distortion of the images that lead to poor estimation of the areas. Our preliminary results are promissory, although they still have shown that it is necessary to implement a camera stabilization system to avoid undesired camera movements, and also a parallel processing procedure in order to speed up the mosaicking process.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Distributed parallel execution systems speed up applications by splitting tasks into processes whose execution is assigned to different receiving nodes in a high-bandwidth network. On the distributing side, a fundamental problem is grouping and scheduling such tasks such that each one involves sufñcient computational cost when compared to the task creation and communication costs and other such practical overheads. On the receiving side, an important issue is to have some assurance of the correctness and characteristics of the code received and also of the kind of load the particular task is going to pose, which can be specified by means of certificates. In this paper we present in a tutorial way a number of general solutions to these problems, and illustrate them through their implementation in the Ciao multi-paradigm language and program development environment. This system includes facilities for parallel and distributed execution, an assertion language for specifying complex programs properties (including safety and resource-related properties), and compile-time and run-time tools for performing automated parallelization and resource control, as well as certification of programs with resource consumption assurances and efñcient checking of such certificates.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Irregular computations pose some of the most interesting and challenging problems in automatic parallelization. Irregularity appears in certain kinds of numerical problems and is pervasive in symbolic applications. Such computations often use dynamic data structures which make heavy use of pointers. This complicates all the steps of a parallelizing compiler, from independence detection to task partitioning and placement. In the past decade there has been significant progress in the development of parallelizing compilers for logic programming and, more recently, constraint programming. The typical applications of these paradigms frequently involve irregular computations, which arguably makes the techniques used in these compilers potentially interesting. In this paper we introduce in a tutorial way some of the problems faced by parallelizing compilers for logic and constraint programs. These include the need for inter-procedural pointer aliasing analysis for independence detection and having to manage speculative and irregular computations through task granularity control and dynamic task allocation. We also provide pointers to some of the progress made in these áreas. In the associated talk we demónstrate representatives of several generations of these parallelizing compilers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Adaptive embedded systems are required in various applications. This work addresses these needs in the area of adaptive image compression in FPGA devices. A simplified version of an evolution strategy is utilized to optimize wavelet filters of a Discrete Wavelet Transform algorithm. We propose an adaptive image compression system in FPGA where optimized memory architecture, parallel processing and optimized task scheduling allow reducing the time of evolution. The proposed solution has been extensively evaluated in terms of the quality of compression as well as the processing time. The proposed architecture reduces the time of evolution by 44% compared to our previous reports while maintaining the quality of compression unchanged with respect to existing implementations. The system is able to find an optimized set of wavelet filters in less than 2 min whenever the input type of data changes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a proposal for an advanced system of debate in an environment of digital democracy which overcomes the limitations of existing systems. We have been especially careful in applying security procedures in telematic systems, for they are to offer citizens the guarantees that society demands. New functional tools have been included to ensure user authentication and to permit anonymous participation where the system is unable to disclose or even to know the identity of system users. The platform prevents participation by non-entitled persons who do not belong to the authorized group from giving their opinion. Furthermore, this proposal allows for verifying the proper function of the system, free of tampering or fraud intended to alter the conclusions or outcomes of participation. All these tools guarantee important aspects of both a social and technical nature, most importantly: freedom of expression, equality and auditability.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Los conjuntos bacterianos son sistemas dinámicos difíciles de modelar debido a que las bacterias colaboran e intercambian información entre sí. Estos microorganismos procariotas pueden tomar decisiones por mayoría e intercambiar información genética importante que, por ejemplo, las haga resistentes a un antibiótico. El proceso de conjugación consiste en el intercambio de un plásmido de una bacteria con otra, permitiendo así que se transfieran propiedades. Estudios recientes han demostrado que estos plásmidos pueden ser reprogramados artificialmente para que la bacteria que lo contenga realice una función específica [1]. Entre la multitud de aplicaciones que supone esta idea, el proyecto europeo PLASWIRES está intentando demostrar que es posible usar organismos vivos como computadores distribuidos en paralelo y plásmidos como conexión entre ellos mediante conjugación. Por tanto, mediante una correcta programación de un plásmido, se puede conseguir, por ejemplo, hacer que una colonia de bacterias haga la función de un antibiótico o detecte otros plásmidos peligrosos en bacterias virulentas. El proceso experimental para demostrar esta idea puede llegar a ser algo lento y tedioso, por lo que es necesario el uso de simuladores que predigan su comportamiento. Debido a que el proyecto PLASWIRES se basa en la conjugación bacteriana, surge la necesidad de un simulador que reproduzca esta operación. El presente trabajo surge debido a la deficiencia del simulador GRO para reproducir la conjugación. En este documento se detallan las modificaciones necesarias para que GRO pueda representar este proceso, así como analizar los datos obtenidos e intentar ajustar el modelo a los datos obtenidos por el Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC). ---ABSTRACT---Bacterial colonies are dynamical systems difficult to model because bacteria collaborate and exchange information with each other. These prokaryotic organisms can make decisions by majority and exchange important genetic information, for example, make them resistant to an antibiotic. The conjugation process is the exchange of a plasmid from one bacterium to another, allowing both to have the same properties. Recent studies have shown that these plasmids can be artificially reprogrammed to make the bacteria that contain it to perform a specific function [1]. Among the multitude of applications involved in this idea, the European project PLASWIRES is attempting to prove that it is possible to use living organisms as parallel and distributed computers with plasmids acting as connectors between them through conjugation. Thus, by properly programming a plasmid, you can get a colony of bacteria that work as an antibiotic or detect hazardous plasmids in virulent bacteria. The experimental process to prove this idea can be slow and tedious, so the use of simulators to predict their behavior is required. Since PLASWIRES project is based on bacterial conjugation, a simulator that can reproduce this operation is required. This work arises due to the absence of the conjugation process in the simulator GRO. This document details the changes made to GRO to represent this process, analyze the data and try to adjust the model to the data obtained by the Institute of Biomedicine and Biotechnology of Cantabria ( IBBTEC ). This project has two main objectives, the first is to add the functionality of intercellular communication by conjugation to the simulator GRO, and the second is to use the experimental data obtained by the IBBTEC. To do this, the following points should be followed: • Study of conjugation biology as a mechanism of intercellular communication. • Design and implementation of the algorithm that simulates conjugation. • Experimental validation and model adjust to the experimental data on rates of conjugation and bacterial growth.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Parallel processing systems require complex interconnection networks. In order to obtain fast and flexible communications at a reasonable cost, different types of networks has been studied in the past. None of them can be considered best. The cost-effectiveness of a particular network design depends of several factors that will not be treat here. Nevertheless, the basic device that configurate an interconnection network can be the same for most of them. In this way, an Optical Interconetion Network made with Holographic Optical Element (HOE) is presented. The HOE recording way use present special caracteristics that are described. A Perfect Shuffle and Banyan networks has been implemented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Las nuevas tendencias de compartir archivos multimedia a través de redes abiertas, demanda el uso de mejores técnicas de encriptación que garanticen la integridad, disponibilidad y confidencialidad, manteniendo y/o mejorando la eficiencia del proceso de cifrado sobre estos archivos. Hoy en día es frecuente la transferencia de imágenes a través de medios tecnológicos, siendo necesario la actualización de las técnicas de encriptación existentes y mejor aún, la búsqueda de nuevas alternativas. Actualmente los algoritmos criptográficos clásicos son altamente conocidos en medio de la sociedad informática lo que provoca mayor vulnerabilidad, sin contar los altos tiempos de procesamiento al momento de ser utilizados, elevando la probabilidad de ser descifrados y minimizando la disponibilidad inmediata de los recursos. Para disminuir estas probabilidades, el uso de la teoría de caos surge como una buena opción para ser aplicada en un algoritmo que tome partida del comportamiento caótico de los sistemas dinámicos, y aproveche las propiedades de los mapas logísticos para elevar el nivel de robustez en el cifrado. Es por eso que este trabajo propone la creación de un sistema criptográfico basado sobre una arquitectura dividida en dos etapas de confusión y difusión. Cada una de ellas utiliza una ecuación logística para generar números pseudoaleatorios que permitan desordenar la posición del píxel y cambiar su intensidad en la escala de grises. Este proceso iterativo es determinado por la cantidad total de píxeles de una imagen. Finalmente, toda la lógica de cifrado es ejecutada sobre la tecnología CUDA que permite el procesamiento en paralelo. Como aporte sustancial, se propone una nueva técnica de encriptación vanguardista de alta sensibilidad ante ruidos externos manteniendo no solo la confidencialidad de la imagen, sino también la disponibilidad y la eficiencia en los tiempos de proceso.---ABSTRACT---New trends to share multimedia files over open networks, demand the best use of encryption techniques to ensure the integrity, availability and confidentiality, keeping and/or improving the efficiency of the encryption process on these files. Today it is common to transfer pictures through technological networks, thus, it is necessary to update existing techniques encryption, and even better, the searching of new alternatives. Nowadays, classic cryptographic algorithms are highly known in the midst of the information society which not only causes greater vulnerability, but high processing times when this algorithms are used. It raise the probability of being deciphered and minimizes the immediate availability of resources. To reduce these odds, the use of chaos theory emerged as a good option to be applied on an algorithm that takes advantage of chaotic behavior of dynamic systems, and take logistic maps’ properties to raise the level of robustness in the encryption. That is why this paper proposes the creation of a cryptographic system based on an architecture divided into two stages: confusion and diffusion. Each stage uses a logistic equation to generate pseudorandom numbers that allow mess pixel position and change their intensity in grayscale. This iterative process is determined by the total number of pixels of an image. Finally, the entire encryption logic is executed on the CUDA technology that enables parallel processing. As a substantial contribution, it propose a new encryption technique with high sensitivity on external noise not only keeping the confidentiality of the image, but also the availability and efficiency in processing times.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La motivación de esta tesis es el desarrollo de una herramienta de optimización automática para la mejora del rendimiento de formas aerodinámicas enfocado en la industria aeronáutica. Este trabajo cubre varios aspectos esenciales, desde el empleo de Non-Uniform Rational B-Splines (NURBS), al cálculo de gradientes utilizando la metodología del adjunto continuo, el uso de b-splines volumétricas como parámetros de diseño, el tratamiento de la malla en las intersecciones, y no menos importante, la adaptación de los algoritmos de la dinámica de fluidos computacional (CFD) en arquitecturas hardware de alto paralelismo, como las tarjetas gráficas, para acelerar el proceso de optimización. La metodología adjunta ha posibilitado que los métodos de optimización basados en gradientes sean una alternativa prometedora para la mejora de la eficiencia aerodinámica de los aviones. La formulación del adjunto permite calcular los gradientes de una función de coste, como la resistencia aerodinámica o la sustentación, independientemente del número de variables de diseño, a un coste computacional equivalente a una simulación CFD. Sin embargo, existen problemas prácticos que han imposibilitado su aplicación en la industria, que se pueden resumir en: integrabilidad, rendimiento computacional y robustez de la solución adjunta. Este trabajo aborda estas contrariedades y las analiza en casos prácticos. Como resumen, las contribuciones de esta tesis son: • El uso de NURBS como variables de diseño en un bucle de automático de optimización, aplicado a la mejora del rendimiento aerodinámico de alas en régimen transónico. • El desarrollo de algoritmos de inversión de punto, para calcular las coordenadas paramétricas de las coordenadas espaciales, para ligar los vértices de malla a las NURBS. • El uso y validación de la formulación adjunta para el calculo de los gradientes, a partir de las sensibilidades de la solución adjunta, comparado con diferencias finitas. • Se ofrece una estrategia para utilizar la geometría CAD, en forma de parches NURBS, para tratar las intersecciones, como el ala-fuselaje. • No existen muchas alternativas de librerías NURBS viables. En este trabajo se ha desarrollado una librería, DOMINO NURBS, y se ofrece a la comunidad como código libre y abierto. • También se ha implementado un código CFD en tarjeta gráfica, para realizar una valoración de cómo se puede adaptar un código sobre malla no estructurada a arquitecturas paralelas. • Finalmente, se propone una metodología, basada en la función de Green, como una forma eficiente de paralelizar simulaciones numéricas. Esta tesis ha sido apoyada por las actividades realizadas por el Área de Dinámica da Fluidos del Instituto Nacional de Técnica Aeroespacial (INTA), a través de numerosos proyectos de financiación nacional: DOMINO, SIMUMAT, y CORESFMULAERO. También ha estado en consonancia con las actividades realizadas por el departamento de Métodos y Herramientas de Airbus España y con el grupo Investigación y Tecnología Aeronáutica Europeo (GARTEUR), AG/52. ABSTRACT The motivation of this work is the development of an automatic optimization strategy for large scale shape optimization problems that arise in the aeronautics industry to improve the aerodynamic performance; covering several aspects from the use of Non-Uniform Rational B-Splines (NURBS), the calculation of the gradients with the continuous adjoint formulation, the development of volumetric b-splines parameterization, mesh adaptation and intersection handling, to the adaptation of Computational Fluid Dynamics (CFD) algorithms to take advantage of highly parallel architectures in order to speed up the optimization process. With the development of the adjoint formulation, gradient-based methods for aerodynamic optimization become a promising approach to improve the aerodynamic performance of aircraft designs. The adjoint methodology allows the evaluation the gradients to all design variables of a cost function, such as drag or lift, at the equivalent cost of more or less one CFD simulation. However, some practical problems have been delaying its full implementation to the industry, which can be summarized as: integrability, computer performance, and adjoint robustness. This work tackles some of these issues and analyse them in well-known test cases. As summary, the contributions comprises: • The employment of NURBS as design variables in an automatic optimization loop for the improvement of the aerodynamic performance of aircraft wings in transonic regimen. • The development of point inversion algorithms to calculate the NURBS parametric coordinates from the space coordinates, to link with the computational grid vertex. • The use and validation of the adjoint formulation to calculate the gradients from the surface sensitivities in an automatic optimization loop and evaluate its reliability, compared with finite differences. • This work proposes some algorithms that take advantage of the underlying CAD geometry description, in the form of NURBS patches, to handle intersections and mesh adaptations. • There are not many usable libraries for NURBS available. In this work an open source library DOMINO NURBS has been developed and is offered to the community as free, open source code. • The implementation of a transonic CFD solver from scratch in a graphic card, for an assessment of the implementability of conventional CFD solvers for unstructured grids to highly parallel architectures. • Finally, this research proposes the use of the Green's function as an efficient paralellization scheme of numerical solvers. The presented work has been supported by the activities carried out at the Fluid Dynamics branch of the National Institute for Aerospace Technology (INTA) through national founding research projects: DOMINO, SIMUMAT, and CORESIMULAERO; in line with the activities carried out by the Methods and Tools and Flight Physics department at Airbus and the Group for Aeronautical Research and Technology in Europe (GARTEUR) action group AG/52.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Esta tesis presenta un modelo, una metodología, una arquitectura, varios algoritmos y programas para crear un lexicón de sentimientos unificado (LSU) que cubre cuatro lenguas: inglés, español, portugués y chino. El objetivo principal es alinear, unificar, y expandir el conjunto de lexicones de sentimientos disponibles en Internet y los desarrollados a lo largo de esta investigación. Así, el principal problema a resolver es la tarea de unificar de forma automatizada los diferentes lexicones de sentimientos obtenidos por el crawler CSR, porque la unidad de medida para asignar la intensidad de los valores de la polaridad (de forma manual, semiautomática y automática) varía de acuerdo con las diferentes metodologías utilizadas para la construcción de cada lexicón. La representación codificada de la estructura de datos de los términos presenta también una variación en la estructura de lexicón a lexicón. Por lo que al unificar en un lexicón de sentimientos se hace posible la reutilización del conocimiento recopilado por los diferentes grupos de investigación y se incrementa, a la vez, el alcance, la calidad y la robustez de los lexicones. Nuestra metodología LSU calcula un valor unificado de la intensidad de la polaridad para cada entrada léxica que está presente en al menos dos de los lexicones de sentimientos que forman parte de este estudio. En contraste, las entradas léxicas que no son comunes en al menos dos de los lexicones conservan su valor original. El coeficiente de Pearson resultante permite medir la correlación existente entre las entradas léxicas asignándoles un rango de valores de uno a menos uno, donde uno indica que los valores de los términos están perfectamente correlacionados, cero indica que no existe correlación y menos uno significa que están inversamente correlacionados. Este procedimiento se lleva acabo con la función de MetricasUnificadas tanto en la CPU como en la GPU. Otro problema a resolver es el tiempo de procesamiento que se requiere para realizar la tarea de unificación de la intensidad de la polaridad y con ello alcanzar una cobertura mayor de lemas en los lexicones de sentimientos existentes. Asimismo, la metodología LSU utiliza el procesamiento paralelo para unificar los 155 802 términos. El algoritmo LSU procesa mediante cargas iguales el subconjunto de entradas léxicas en cada uno de los 1344 núcleos en la GPU. Los resultados de nuestro análisis arrojaron un total de 95 430 entradas léxicas donde 35 201 obtuvieron valores positivos, 22 029 negativos y 38 200 neutrales. Finalmente, el tiempo de ejecución fue de 2,506 segundos para el total de las entradas léxicas, lo que permitió reducir el procesamiento de cómputo hasta en una tercera parte con respecto al algoritmo secuencial. De estos resultados se concluye que al lograr un lexicón de sentimientos unificado que permite homogeneizar la intensidad de la polaridad de las unidades léxicas (con valores positivos, negativos y neutrales) deriva no sólo en el análisis semántico del corpus basado en los términos con una mayor carga de polaridad, o del resumen de las valoraciones o las tendencias de neuromarketing, sino también en aplicaciones como el etiquetado subjetivo de sitios web o de portales sintácticos y semánticos, por mencionar algunas. ABSTRACT This thesis presents an approach to create what we have called a Unified Sentiment Lexicon (USL). This approach aims at aligning, unifying, and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. One problem related to the task of the automatic unification of different scores of sentiment lexicons is that there are multiple lexical entries for which the classification of positive, negative, or neutral P, N, Z depends on the unit of measurement used in the annotation methodology of the source sentiment lexicon. Our USL approach computes the unified strength of polarity of each lexical entry based on the Pearson correlation coefficient which measures how correlated lexical entries are with a value between 1 and - 1 , where 1 indicates that the lexical entries are perfectly correlated, 0 indicates no correlation, and -1 means they are perfectly inversely correlated and so is the UnifiedMetrics procedure for CPU and GPU, respectively. Another problem is the high processing time required for computing all the lexical entries in the unification task. Thus, the USL approach computes a subset of lexical entries in each of the 1344 GPU cores and uses parallel processing in order to unify 155,802 lexical entries. The results of the analysis conducted using the USL approach show that the USL has 95,430 lexical entries, out of which there are 35,201 considered to be positive, 22,029 negative, and 38,200 neutral. Finally, the runtime was 2.505 seconds for 95,430 lexical entries; this allows a reduction of the time computing for the UnifiedMetrics by 3 times with respect to the sequential implementation. A key contribution of this work is that we preserve the use of a unified sentiment lexicon for all tasks. Such lexicon is used to define resources and resource-related properties that can be verified based on the results of the analysis and is powerful, general and extensible enough to express a large class of interesting properties. Some applications of this work include merging, aligning, pruning and extending the current sentiment lexicons.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Postprint

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Postprint