892 resultados para Parallel computing, Virtual machine, Composition, Determinism, Abstraction
Resumo:
Modeling the evolution of the state of program memory during program execution is critical to many parallehzation techniques. Current memory analysis techniques either provide very accurate information but run prohibitively slowly or produce very conservative results. An approach based on abstract interpretation is presented for analyzing programs at compile time, which can accurately determine many important program properties such as aliasing, logical data structures and shape. These properties are known to be critical for transforming a single threaded program into a versión that can be run on múltiple execution units in parallel. The analysis is shown to be of polynomial complexity in the size of the memory heap. Experimental results for benchmarks in the Jolden suite are given. These results show that in practice the analysis method is efflcient and is capable of accurately determining shape information in programs that créate and manipúlate complex data structures.
Resumo:
En los últimos años, debido al notable desarrollo de los terminales portátiles, que han pasado de ser “simples” teléfonos o reproductores a puros ordenadores, ha crecido el número de servicios que ofrecen cada vez mayor cantidad de contenido multimedia a través de internet. Además, la distinta evolución de estos terminales hace que nos encontremos en el mercado con una amplísima gama de productos de diferentes tamaños y capacidades de procesamiento, lo que hace necesario encontrar una fórmula que permita satisfacer la demanda de dichos servicios sea cual sea la naturaleza de nuestro dispositivo. Para poder ofrecer una solución adecuada se ha optado por la integración de un protocolo como RTP y un estándar de video como SVC. RTP (Real-time Transport Protocol), en contraposición a los protocolos de propósito general fue diseñado para aplicaciones de tiempo real por lo que es ideal para el streaming de contenido multimedia. Por su parte, SVC es un estándar de video escalable que permite transmitir en un mismo stream una capa base y múltiples capas de mejora, por lo que podremos adaptar la calidad y tamaño del contenido a la capacidad y tamaño de nuestro dispositivo. El objetivo de este proyecto consiste en integrar y modificar tanto el reproductor MPlayer como la librería RTP live555 de tal forma que sean capaces de soportar el formato SVC sobre el protocolo RTP y montar un sistema servidorcliente para comprobar su funcionamiento. Aunque este proceso esté orientado a llevarse a cabo en un dispositivo móvil, para este proyecto se ha optado por realizarlo en el escenario más sencillo posible, para lo cual, se emitirán secuencias a una máquina virtual alojada en el mismo ordenador que el servidor. ABSTRACT In recent years, due to the remarkable development of mobile devices, which have evolved from "simple" phones or players to computers, the amount of services that offer multimedia content over the internet have shot up. Furthermore, the different evolution of these terminals causes that we can find in the market a wide range of different sizes and processing capabilities, making necessary to find a formula that will satisfy the demand for such services regardless of the nature of our device. In order to provide a suitable solution we have chosen to integrate a protocol as RTP and a video standard as SVC. RTP (Real-time Transport Protocol), in opposition to general purpose protocols was designed for real-time applications making it ideal for media streaming. Meanwhile, SVC is a scalable video standard which can transmit a single stream in a base layer and multiple enhancement layers, so that we can adapt the quality and size of the content to the capacity and size of our device. The objective of this project is to integrate and modify both MPlayer and RTP library live555 so that they support the SVC format over RTP protocol and set up a client-server system to check its behavior. Although this process has been designed to be done on a mobile device, for this project we have chosen to do it in the simplest possible scenario so we will stream to a virtual machine hosted on the same computer where we have the server.
Resumo:
Virtualization techniques have received increased attention in the field of embedded real-time systems. Such techniques provide a set of virtual machines that run on a single hardware platform, thus allowing several application programs to be executed as though they were running on separate machines, with isolated memory spaces and a fraction of the real processor time available to each of them.This papers deals with some problems that arise when implementing real-time systems written in Ada on a virtual machine. The effects of virtualization on the performance of the Ada real-time services are analysed, and requirements for the virtualization layer are derived. Virtual-machine time services are also defined in order to properly support Ada real-time applications. The implementation of the ORK+ kernel on the XtratuM supervisor is used as an example.
Resumo:
Tissue P systems generalize the membrane structure tree usual in original models of P systems to an arbitrary graph. Basic opera- tions in these systems are communication rules, enriched in some variants with cell division or cell separation. Several variants of tissue P systems were recently studied, together with the concept of uniform families of these systems. Their computational power was shown to range between P and NP ? co-NP , thus characterizing some interesting borderlines between tractability and intractability. In this paper we show that com- putational power of these uniform families in polynomial time is limited by the class PSPACE . This class characterizes the power of many clas- sical parallel computing models
Resumo:
Los avances que se han producido en los últimos años en cuanto a potencia y capacidades de los teléfonos móviles que usamos de manera cotidiana, traen de la mano un auge en la demanda de aplicaciones de todo ámbito: desde aplicaciones generales de consumo, pasando por juegos, hasta aplicaciones que ofrecen soluciones internas a empresas. Existen diferentes sistemas operativos para teléfonos móviles como se explicará más adelante en el capítulo introductorio. En dicho capítulo se da la justificación de por qué en el presente Proyecto Fin de Carrera se centra en el estudio del sistema operativo Android. Primeramente se dará una visión global del estado del arte en cuanto al mundo de aplicaciones móviles se refiere. Se explicarán los pros y contras de cada sistema operativo, detallando el lenguaje de programación utilizado en cada uno de ellos y sus principales características. Después, en el capítulo tres se estudiará con más profundidad el sistema operativo Android, desde su historia y orígenes, hasta los componentes básicos para la creación de una aplicación, pasando por la arquitectura interna del sistema o su máquina virtual. Con esto se pretende que el lector tenga un contexto que le permita comprender los siguientes capítulos, que es donde está el núcleo de este Proyecto Fin de Carrera. El cuarto capítulo trata de una serie de prácticas incrementales, que cubren una gran parte de las posibilidades que ofrece el sistema operativo Android para el desarrollo de aplicaciones. Se ha pretendido que la dificultad vaya de menos a más y que las prácticas se vayan apoyando en las anteriores, para tener al final una única solución que englobe todas las lecciones. El último capítulo quiere englobar el uso de todas las lecciones aprendidas en las lecciones anteriores para crear una aplicación que bien podría ser una aplicación real para un cliente. Se trata de una aplicación que muestra en tiempo real información sobre las cámaras de tráfico de la ciudad de Madrid. ABSTRACT. The improvements that have occurred in recent years in terms of power and capabilities of mobile phones that we use on a daily basis, bring an increment in demand for all kind of applications, from general consumer applications, games or even internal applications that offer solutions to companies. There are different operating systems for mobile phones as will be explained later in the introductory chapter. In that chapter the answer for why this Thesis focuses on the study of the Android operating system is given as well. First an overview of the state of the art about the world of mobile applications will be referred. The pros and cons of each operating system will be explained, detailing the programming language used in each of them and their main characteristics. Then in chapter three will be discussed in more depth the Android operating system, from its history and beginnings to the main components for the creation of an application, to the internal architecture of the system or virtual machine. The goal of chapter three is to give the readers a context that allows them to understand the following chapters, where the core of this Thesis is. The fourth chapter contains a series of incremental practices covering a large part of the potential of the Android operating system for application development. Those practices grow in difficulty and are supported by the previous in order to have at the end a single solution that fits all lessons. The last chapter wants to embrace the use of all the lessons learned in previous lessons to create an application that could well be an actual application for a client. It is an application that displays real-time information off traffic cameras of the city of Madrid.
Resumo:
El sistema operativo FreeBSD soporta distintos modos de virtualización sobre la plataforma Xen. Cada uno usa una técnicas de virtualización distinta, logrando mayor o menor integración con el hipervisor. Actualmente, están soportados en FreeBSD el modo paravirtualizado, virtualizado asistido por hardware y modos híbridos. Este trabajo consiste fundamentalmente en un estudio práctico de los distintos modos de virtualización Xen soportados en FreeBSD, basándose en pruebas de sintéticas de rendimiento. Se incluye una comparativa con gráficas de los resultados obtenidos mediante un sistema de pruebas automáticas desarrollado en shell script y R. ABSTRACT. The FreeBSD operative system supports several virtualization modes when used over the Xen platform. Each mode uses a different virtualization technique, achieving different level of integration with the hypervisor. Current supported modes on FreeBSD are paravirtualized mode, hardware virtualization assisted and hybrid modes. This work is a survey on FreeBSD virtualization over Xen, focused on performance by benchmark testing all supported virtual machine implementations. The study includes a comparative of the measured test results performed by an automatic testing tool developed on shell and R script.
Resumo:
PAMELA (Phased Array Monitoring for Enhanced Life Assessment) SHMTM System is an integrated embedded ultrasonic guided waves based system consisting of several electronic devices and one system manager controller. The data collected by all PAMELA devices in the system must be transmitted to the controller, who will be responsible for carrying out the advanced signal processing to obtain SHM maps. PAMELA devices consist of hardware based on a Virtex 5 FPGA with a PowerPC 440 running an embedded Linux distribution. Therefore, PAMELA devices, in addition to the capability of performing tests and transmitting the collected data to the controller, have the capability of perform local data processing or pre-processing (reduction, normalization, pattern recognition, feature extraction, etc.). Local data processing decreases the data traffic over the network and allows CPU load of the external computer to be reduced. Even it is possible that PAMELA devices are running autonomously performing scheduled tests, and only communicates with the controller in case of detection of structural damages or when programmed. Each PAMELA device integrates a software management application (SMA) that allows to the developer downloading his own algorithm code and adding the new data processing algorithm to the device. The development of the SMA is done in a virtual machine with an Ubuntu Linux distribution including all necessary software tools to perform the entire cycle of development. Eclipse IDE (Integrated Development Environment) is used to develop the SMA project and to write the code of each data processing algorithm. This paper presents the developed software architecture and describes the necessary steps to add new data processing algorithms to SMA in order to increase the processing capabilities of PAMELA devices.An example of basic damage index estimation using delay and sum algorithm is provided.
Resumo:
El tiempo de concentración de una cuenca sigue siendo relativamente desconocido para los ingenieros. El procedimiento habitual en un estudio hidrológico es calcularlo según varias fórmulas escogidas entre las existentes para después emplear el valor medio obtenido. De esta media se derivan los demás resultados hidrológicos, resultados que influirán en el futuro dimensionamiento de las infraestructuras. Este trabajo de investigación comenzó con el deseo de conseguir un método más fiable y objetivo que permitiera obtener el tiempo de concentración. Dada la imposibilidad de poner en práctica ensayos hidrológicos en una cuenca física real, ya que no resulta viable monitorizar perfectamente la precipitación ni los caudales de salida, se planteó llevar a cabo los ensayos de forma simulada, con el empleo de modelos hidráulicos bidimensionales de lluvia directa sobre malla 2D de volúmenes finitos. De entre todos los disponibles, se escogió InfoWorks ICM, por su rapidez y facilidad de uso. En una primera fase se efectuó la validación del modelo hidráulico elegido, contrastando los resultados de varias simulaciones con la formulación analítica existente. Posteriormente, se comprobaron los valores de los tiempos de concentración obtenidos con las expresiones referenciadas en la bibliografía, consiguiéndose resultados muy satisfactorios. Una vez verificado, se ejecutaron 690 simulaciones de cuencas tanto naturales como sintéticas, incorporando variaciones de área, pendiente, rugosidad, intensidad y duración de las precipitaciones, a fin de obtener sus tiempos de concentración y retardo. Esta labor se realizó con ayuda de la aceleración del cálculo vectorial que ofrece la tecnología CUDA (Arquitectura Unificada de Dispositivos de Cálculo). Basándose en el análisis dimensional, se agruparon los resultados del tiempo de concentración en monomios adimensionales. Utilizando regresión lineal múltiple, se obtuvo una nueva formulación para el tiempo de concentración. La nueva expresión se contrastó con las formulaciones clásicas, habiéndose obtenido resultados equivalentes. Con la exposición de esta nueva metodología se pretende ayudar al ingeniero en la realización de estudios hidrológicos. Primero porque proporciona datos de manera sencilla y objetiva que se pueden emplear en modelos globales como HEC-HMS. Y segundo porque en sí misma se ha comprobado como una alternativa realmente válida a la metodología hidrológica habitual. Time of concentration remains still fairly imprecise to engineers. A normal hydrological study goes through several formulae, obtaining concentration time as the median value. Most of the remaining hydrologic results will be derived from this value. Those results will determine how future infrastructures will be designed. This research began with the aim to acquire a more reliable and objective method to estimate concentration times. Given the impossibility of carrying out hydrological tests in a real watershed, due to the difficulties related to accurate monitoring of rainfall and derived outflows, a model-based approach was proposed using bidimensional hydraulic simulations of direct rainfall over a 2D finite-volume mesh. Amongst all of the available software packages, InfoWorks ICM was chosen for its speed and ease of use. As a preliminary phase, the selected hydraulic model was validated, checking the outcomes of several simulations over existing analytical formulae. Next, concentration time values were compared to those resulting from expressions referenced in the technical literature. They proved highly satisfactory. Once the model was properly verified, 690 simulations of both natural and synthetic basins were performed, incorporating variations of area, slope, roughness, intensity and duration of rainfall, in order to obtain their concentration and lag times. This job was carried out in a reasonable time lapse with the aid of the parallel computing platform technology CUDA (Compute Unified Device Architecture). Performing dimensional analysis, concentration time results were isolated in dimensionless monomials. Afterwards, a new formulation for the time of concentration was obtained using multiple linear regression. This new expression was checked against classical formulations, obtaining equivalent results. The publication of this new methodology intends to further assist the engineer while carrying out hydrological studies. It is effective to provide global parameters that will feed global models as HEC-HMS on a simple and objective way. It has also been proven as a solid alternative to usual hydrology methodology.
Resumo:
Este proyecto fin de carrera tiene como finalidad el diseño y la implementación de un sistema de monitorización y gestión dinámica de redes de sensores y actuadores inalámbricos (Wireless Sensor and Actuator Networks – WSAN) en base a la información de configuración almacenada en una base de datos sobre la cual un motor de detección vigila posibles cambios. Este motor informará de los cambios a la herramienta de gestión y monitorización de la WSAN para que sean llevados a cabo en la red desplegada. Este trabajo se enmarca en otro más amplio cuya finalidad es la demostración de la posibilidad de reconfigurar dinámicamente una WSAN utilizando los mecanismos propios de las Líneas de Productos Software Dinámicos (DSPL, por sus siglas en inglés). Se ha diseñado e implementado el software que proporciona los métodos necesarios para la comunicación y actuación sobre la red de sensores y actuadores inalámbricos, además de permitir el control de cada uno de los dispositivos pertenecientes a dicha red y que los dispositivos se incorporen a dicha red de manera autónoma. El desarrollo y pruebas de este proyecto fin de carrera se ha realizado utilizando una máquina virtual sobre la que se ha configurado convenientemente una plataforma que incluye un emulador de red de sensores y actuadores de tecnología SunSpot (Solarium) y todas las herramientas de desarrollo y ejecución necesarias (entre ellas, SunSpot SDK 6.0 y NetBeans). Esta máquina virtual ejecuta un sistema operativo Unix (Ubuntu Server 12.4) y facilita el rápido despliegue de las herramientas implementadas así como la integración de las mismas en desarrollos más amplios. En esta memoria se describe todo el proceso de diseño e implementación del software desarrollado, las conclusiones obtenidas de su ejecución y una guía de usuario para su despliegue y manejo. ABSTRACT. The aim of this project is the design and implementation of a system to monitor and dynamically manage a wireless sensor and actuator network (WSAN) in consistence with the configuration information stored in a database whose changes are monitored by a so-called monitoring engine. This engine informs the management and monitoring tool about the changes, in order for these to be carried out on the deployed network. This project is a part of a broader one aimed at demonstrating the ability to dynamically reconfigure a WSAN using the mechanisms of the Dynamic Software Product Lines (DSPL). A software has been designed and implemented which provides the methods to communicate with and actuate on the WSAN. It also allows to control each of the devices, as well as their autonomous incorporation to the network. Development and testing of this project was done using a virtual machine that has a conveniently configured platform which includes a SunSpot technology WSAN emulator (Solarium) as well as all the necessary development and implementation tools (including SunSpot 6.0 SDK and NetBeans). This virtual machine runs a Unix (Ubuntu Server 12.4) operating system and makes it easy to rapidly deploy the implemented tools and to integrate them into broader developments. This document explains the whole process of designing and implementing the software, the conclusions of execution and a user's manual.
Resumo:
A simple evolutionary process can discover sophisticated methods for emergent information processing in decentralized spatially extended systems. The mechanisms underlying the resulting emergent computation are explicated by a technique for analyzing particle-based logic embedded in pattern-forming systems. Understanding how globally coordinated computation can emerge in evolution is relevant both for the scientific understanding of natural information processing and for engineering new forms of parallel computing systems.
Resumo:
Devido às tendências de crescimento da quantidade de dados processados e a crescente necessidade por computação de alto desempenho, mudanças significativas estão acontecendo no projeto de arquiteturas de computadores. Com isso, tem-se migrado do paradigma sequencial para o paralelo, com centenas ou milhares de núcleos de processamento em um mesmo chip. Dentro desse contexto, o gerenciamento de energia torna-se cada vez mais importante, principalmente em sistemas embarcados, que geralmente são alimentados por baterias. De acordo com a Lei de Moore, o desempenho de um processador dobra a cada 18 meses, porém a capacidade das baterias dobra somente a cada 10 anos. Esta situação provoca uma enorme lacuna, que pode ser amenizada com a utilização de arquiteturas multi-cores heterogêneas. Um desafio fundamental que permanece em aberto para estas arquiteturas é realizar a integração entre desenvolvimento de código embarcado, escalonamento e hardware para gerenciamento de energia. O objetivo geral deste trabalho de doutorado é investigar técnicas para otimização da relação desempenho/consumo de energia em arquiteturas multi-cores heterogêneas single-ISA implementadas em FPGA. Nesse sentido, buscou-se por soluções que obtivessem o melhor desempenho possível a um consumo de energia ótimo. Isto foi feito por meio da combinação de mineração de dados para a análise de softwares baseados em threads aliadas às técnicas tradicionais para gerenciamento de energia, como way-shutdown dinâmico, e uma nova política de escalonamento heterogeneity-aware. Como principais contribuições pode-se citar a combinação de técnicas de gerenciamento de energia em diversos níveis como o nível do hardware, do escalonamento e da compilação; e uma política de escalonamento integrada com uma arquitetura multi-core heterogênea em relação ao tamanho da memória cache L1.
Resumo:
A computação paralela permite uma série de vantagens para a execução de aplicações de grande porte, sendo que o uso efetivo dos recursos computacionais paralelos é um aspecto relevante da computação de alto desempenho. Este trabalho apresenta uma metodologia que provê a execução, de forma automatizada, de aplicações paralelas baseadas no modelo BSP com tarefas heterogêneas. É considerado no modelo adotado, que o tempo de computação de cada tarefa secundária não possui uma alta variância entre uma iteração e outra. A metodologia é denominada de ASE e é composta por três etapas: Aquisição (Acquisition), Escalonamento (Scheduling) e Execução (Execution). Na etapa de Aquisição, os tempos de processamento das tarefas são obtidos; na etapa de Escalonamento a metodologia busca encontrar a distribuição de tarefas que maximize a velocidade de execução da aplicação paralela, mas minimizando o uso de recursos, por meio de um algoritmo desenvolvido neste trabalho; e por fim a etapa de Execução executa a aplicação paralela com a distribuição definida na etapa anterior. Ferramentas que são aplicadas na metodologia foram implementadas. Um conjunto de testes aplicando a metodologia foi realizado e os resultados apresentados mostram que os objetivos da proposta foram alcançados.
Resumo:
Self-organising neural models have the ability to provide a good representation of the input space. In particular the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This paper proposes a Graphics Processing Unit (GPU) parallel implementation of the GNG with Compute Unified Device Architecture (CUDA). In contrast to existing algorithms, the proposed GPU implementation allows the acceleration of the learning process keeping a good quality of representation. Comparative experiments using iterative, parallel and hybrid implementations are carried out to demonstrate the effectiveness of CUDA implementation. The results show that GNG learning with the proposed implementation achieves a speed-up of 6× compared with the single-threaded CPU implementation. GPU implementation has also been applied to a real application with time constraints: acceleration of 3D scene reconstruction for egomotion, in order to validate the proposal.
Resumo:
A parallel algorithm to remove impulsive noise in digital images using heterogeneous CPU/GPU computing is proposed. The parallel denoising algorithm is based on the peer group concept and uses an Euclidean metric. In order to identify the amount of pixels to be allocated in multi-core and GPUs, a performance analysis using large images is presented. A comparison of the parallel implementation in multi-core, GPUs and a combination of both is performed. Performance has been evaluated in terms of execution time and Megapixels/second. We present several optimization strategies especially effective for the multi-core environment, and demonstrate significant performance improvements. The main advantage of the proposed noise removal methodology is its computational speed, which enables efficient filtering of color images in real-time applications.
Resumo:
Tool path generation is one of the most complex problems in Computer Aided Manufacturing. Although some efficient strategies have been developed, most of them are only useful for standard machining. However, the algorithms used for tool path computation demand a higher computation performance, which makes the implementation on many existing systems very slow or even impractical. Hardware acceleration is an incremental solution that can be cleanly added to these systems while keeping everything else intact. It is completely transparent to the user. The cost is much lower and the development time is much shorter than replacing the computers by faster ones. This paper presents an optimisation that uses a specific graphic hardware approach using the power of multi-core Graphic Processing Units (GPUs) in order to improve the tool path computation. This improvement is applied on a highly accurate and robust tool path generation algorithm. The paper presents, as a case of study, a fully implemented algorithm used for turning lathe machining of shoe lasts. A comparative study will show the gain achieved in terms of total computing time. The execution time is almost two orders of magnitude faster than modern PCs.