892 resultados para Parallel computing, Virtual machine, Composition, Determinism, Abstraction
Resumo:
Sao Paulo State Research Foundation-FAPESP
Resumo:
Pós-graduação em Ciência da Computação - IBILCE
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
This paper presents a new parallel methodology for calculating the determinant of matrices of the order n, with computational complexity O(n), using the Gauss-Jordan Elimination Method and Chio's Rule as references. We intend to present our step-by-step methodology using clear mathematical language, where we will demonstrate how to calculate the determinant of a matrix of the order n in an analytical format. We will also present a computational model with one sequential algorithm and one parallel algorithm using a pseudo-code.
Resumo:
The scale down of transistor technology allows microelectronics manufacturers such as Intel and IBM to build always more sophisticated systems on a single microchip. The classical interconnection solutions based on shared buses or direct connections between the modules of the chip are becoming obsolete as they struggle to sustain the increasing tight bandwidth and latency constraints that these systems demand. The most promising solution for the future chip interconnects are the Networks on Chip (NoC). NoCs are network composed by routers and channels used to inter- connect the different components installed on the single microchip. Examples of advanced processors based on NoC interconnects are the IBM Cell processor, composed by eight CPUs that is installed on the Sony Playstation III and the Intel Teraflops pro ject composed by 80 independent (simple) microprocessors. On chip integration is becoming popular not only in the Chip Multi Processor (CMP) research area but also in the wider and more heterogeneous world of Systems on Chip (SoC). SoC comprehend all the electronic devices that surround us such as cell-phones, smart-phones, house embedded systems, automotive systems, set-top boxes etc... SoC manufacturers such as ST Microelectronics , Samsung, Philips and also Universities such as Bologna University, M.I.T., Berkeley and more are all proposing proprietary frameworks based on NoC interconnects. These frameworks help engineers in the switch of design methodology and speed up the development of new NoC-based systems on chip. In this Thesis we propose an introduction of CMP and SoC interconnection networks. Then focusing on SoC systems we propose: • a detailed analysis based on simulation of the Spidergon NoC, a ST Microelectronics solution for SoC interconnects. The Spidergon NoC differs from many classical solutions inherited from the parallel computing world. Here we propose a detailed analysis of this NoC topology and routing algorithms. Furthermore we propose aEqualized a new routing algorithm designed to optimize the use of the resources of the network while also increasing its performance; • a methodology flow based on modified publicly available tools that combined can be used to design, model and analyze any kind of System on Chip; • a detailed analysis of a ST Microelectronics-proprietary transport-level protocol that the author of this Thesis helped developing; • a simulation-based comprehensive comparison of different network interface designs proposed by the author and the researchers at AST lab, in order to integrate shared-memory and message-passing based components on a single System on Chip; • a powerful and flexible solution to address the time closure exception issue in the design of synchronous Networks on Chip. Our solution is based on relay stations repeaters and allows to reduce the power and area demands of NoC interconnects while also reducing its buffer needs; • a solution to simplify the design of the NoC by also increasing their performance and reducing their power and area consumption. We propose to replace complex and slow virtual channel-based routers with multiple and flexible small Multi Plane ones. This solution allows us to reduce the area and power dissipation of any NoC while also increasing its performance especially when the resources are reduced. This Thesis has been written in collaboration with the Advanced System Technology laboratory in Grenoble France, and the Computer Science Department at Columbia University in the city of New York.
Resumo:
Complex networks analysis is a very popular topic in computer science. Unfortunately this networks, extracted from different contexts, are usually very large and the analysis may be very complicated: computation of metrics on these structures could be very complex. Among all metrics we analyse the extraction of subnetworks called communities: they are groups of nodes that probably play the same role within the whole structure. Communities extraction is an interesting operation in many different fields (biology, economics,...). In this work we present a parallel community detection algorithm that can operate on networks with huge number of nodes and edges. After an introduction to graph theory and high performance computing, we will explain our design strategies and our implementation. Then, we will show some performance evaluation made on a distributed memory architectures i.e. the supercomputer IBM-BlueGene/Q "Fermi" at the CINECA supercomputing center, Italy, and we will comment our results.
Resumo:
During the last few decades an unprecedented technological growth has been at the center of the embedded systems design paramount, with Moore’s Law being the leading factor of this trend. Today in fact an ever increasing number of cores can be integrated on the same die, marking the transition from state-of-the-art multi-core chips to the new many-core design paradigm. Despite the extraordinarily high computing power, the complexity of many-core chips opens the door to several challenges. As a result of the increased silicon density of modern Systems-on-a-Chip (SoC), the design space exploration needed to find the best design has exploded and hardware designers are in fact facing the problem of a huge design space. Virtual Platforms have always been used to enable hardware-software co-design, but today they are facing with the huge complexity of both hardware and software systems. In this thesis two different research works on Virtual Platforms are presented: the first one is intended for the hardware developer, to easily allow complex cycle accurate simulations of many-core SoCs. The second work exploits the parallel computing power of off-the-shelf General Purpose Graphics Processing Units (GPGPUs), with the goal of an increased simulation speed. The term Virtualization can be used in the context of many-core systems not only to refer to the aforementioned hardware emulation tools (Virtual Platforms), but also for two other main purposes: 1) to help the programmer to achieve the maximum possible performance of an application, by hiding the complexity of the underlying hardware. 2) to efficiently exploit the high parallel hardware of many-core chips in environments with multiple active Virtual Machines. This thesis is focused on virtualization techniques with the goal to mitigate, and overtake when possible, some of the challenges introduced by the many-core design paradigm.
Resumo:
After decades of development in programming languages and programming environments, Smalltalk is still one of few environments that provide advanced features and is still widely used in the industry. However, as Java became prevalent, the ability to call Java code from Smalltalk and vice versa becomes important. Traditional approaches to integrate the Java and Smalltalk languages are through low-level communication between separate Java and Smalltalk virtual machines. We are not aware of any attempt to execute and integrate the Java language directly in the Smalltalk environment. A direct integration allows for very tight and almost seamless integration of the languages and their objects within a single environment. Yet integration and language interoperability impose challenging issues related to method naming conventions, method overloading, exception handling and thread-locking mechanisms. In this paper we describe ways to overcome these challenges and to integrate Java into the Smalltalk environment. Using techniques described in this paper, the programmer can call Java code from Smalltalk using standard Smalltalk idioms while the semantics of each language remains preserved. We present STX:LIBJAVA - an implementation of Java virtual machine within Smalltalk/X - as a validation of our approach
Resumo:
The past decade has seen the energy consumption in servers and Internet Data Centers (IDCs) skyrocket. A recent survey estimated that the worldwide spending on servers and cooling have risen to above $30 billion and is likely to exceed spending on the new server hardware . The rapid rise in energy consumption has posted a serious threat to both energy resources and the environment, which makes green computing not only worthwhile but also necessary. This dissertation intends to tackle the challenges of both reducing the energy consumption of server systems and by reducing the cost for Online Service Providers (OSPs). Two distinct subsystems account for most of IDC’s power: the server system, which accounts for 56% of the total power consumption of an IDC, and the cooling and humidifcation systems, which accounts for about 30% of the total power consumption. The server system dominates the energy consumption of an IDC, and its power draw can vary drastically with data center utilization. In this dissertation, we propose three models to achieve energy effciency in web server clusters: an energy proportional model, an optimal server allocation and frequency adjustment strategy, and a constrained Markov model. The proposed models have combined Dynamic Voltage/Frequency Scaling (DV/FS) and Vary-On, Vary-off (VOVF) mechanisms that work together for more energy savings. Meanwhile, corresponding strategies are proposed to deal with the transition overheads. We further extend server energy management to the IDC’s costs management, helping the OSPs to conserve, manage their own electricity cost, and lower the carbon emissions. We have developed an optimal energy-aware load dispatching strategy that periodically maps more requests to the locations with lower electricity prices. A carbon emission limit is placed, and the volatility of the carbon offset market is also considered. Two energy effcient strategies are applied to the server system and the cooling system respectively. With the rapid development of cloud services, we also carry out research to reduce the server energy in cloud computing environments. In this work, we propose a new live virtual machine (VM) placement scheme that can effectively map VMs to Physical Machines (PMs) with substantial energy savings in a heterogeneous server cluster. A VM/PM mapping probability matrix is constructed, in which each VM request is assigned with a probability running on PMs. The VM/PM mapping probability matrix takes into account resource limitations, VM operation overheads, server reliability as well as energy effciency. The evolution of Internet Data Centers and the increasing demands of web services raise great challenges to improve the energy effciency of IDCs. We also express several potential areas for future research in each chapter.
Resumo:
Cost-efficient operation while satisfying performance and availability guarantees in Service Level Agreements (SLAs) is a challenge for Cloud Computing, as these are potentially conflicting objectives. We present a framework for SLA management based on multi-objective optimization. The framework features a forecasting model for determining the best virtual machine-to-host allocation given the need to minimize SLA violations, energy consumption and resource wasting. A comprehensive SLA management solution is proposed that uses event processing for monitoring and enables dynamic provisioning of virtual machines onto the physical infrastructure. We validated our implementation against serveral standard heuristics and were able to show that our approach is significantly better.
Resumo:
Context. Direct observations of gaseous exoplanets reveal that their gas envelope has a higher C/O ratio than that of the host star (e.g., Wasp 12-b). This has been explained by considering that the gas phase of the disc could be inhomogeneous, exceeding the stellar C/O ratio in regions where these planets formed; but few studies have considered the drift of the gas and planet migration. Aims. We aim to derive the gas composition in planets through planet formation to evaluate if the formation of giant planets with an enriched C/O ratio is possible. The study focusses on the effects of different processes on the C/O ratio, such as the disc evolution, the drift of gas, and planet migration. Methods. We used our previous models for computing the chemical composition, together with a planet formation model, to which we added the composition and drift of the gas phase of the disc, which is composed of the main volatile species H2O, CO, CO2, NH3, N2, CH3OH, CH4, and H2S, H2 and He. The study focusses on the region where ice lines are present and influence the C/O ratio of the planets. Results. Modelling shows that the condensation of volatile species as a function of radial distance allows for C/O enrichment in specific parts of the protoplanetary disc of up to four times the solar value. This leads to the formation of planets that can be enriched in C/O in their envelope up to three times the solar value. Planet migration, gas phase evolution and disc irradiation enables the evolution of the initial C/O ratio that decreases in the outer part of the disc and increases in the inner part of the disc. The total C/O ratio of the planets is governed by the contribution of ices accreted, suggesting that high C/O ratios measured in planetary atmospheres are indicative of a lack of exchange of material between the core of a planet and its envelope or an observational bias. It also suggests that the observed C/O ratio is not representative of the total C/O ratio of the planet.
Resumo:
We present a parallel graph narrowing machine, which is used to implement a functional logic language on a shared memory multiprocessor. It is an extensión of an abstract machine for a purely functional language. The result is a programmed graph reduction machine which integrates the mechanisms of unification, backtracking, and independent and-parallelism. In the machine, the subexpressions of an expression can run in parallel. In the case of backtracking, the structure of an expression is used to avoid the reevaluation of subexpressions as far as possible. Deterministic computations are detected. Their results are maintained and need not be reevaluated after backtracking.
Resumo:
El mundo tecnológico está cambiando hacia la optimización en la gestión de recursos gracias a la poderosa influencia de tecnologías como la virtualización y la computación en la nube (Cloud Computing). En esta memoria se realiza un acercamiento a las mismas, desde las causas que las motivaron hasta sus últimas tendencias, pasando por la identificación de sus principales características, ventajas e inconvenientes. Por otro lado, el Hogar Digital es ya una realidad para la mayoría de los seres humanos. En él se dispone de acceso a múltiples tipos de redes de telecomunicaciones (3G, 4G, WI-FI, ADSL…) con más o menos capacidad pero que permiten conexiones a internet desde cualquier parte, en todo momento, y con prácticamente cualquier dispositivo (ordenadores personales, smartphones, tabletas, televisores…). Esto es aprovechado por las empresas para ofrecer todo tipo de servicios. Algunos de estos servicios están basados en el cloud computing sobre todo ofreciendo almacenamiento en la nube a aquellos dispositivos con capacidad reducida, como son los smarthphones y las tabletas. Ese espacio de almacenamiento normalmente está en los servidores bajo el control de grandes compañías. Guardar documentos, videos, fotos privadas sin tener la certeza de que estos no son consultados por alguien sin consentimiento, puede despertar en el usuario cierto recelo. Para estos usuarios que desean control sobre su intimidad, se ofrece la posibilidad de que sea el propio usuario el que monte sus propios servidores y su propio servicio cloud para compartir su información privada sólo con sus familiares y amigos o con cualquiera al que le dé permiso. Durante el proyecto se han comparado diversas soluciones, la mayoría de código abierto y de libre distribución, que permiten desplegar como mínimo un servicio de almacenamiento accesible a través de Internet. Algunas de ellas lo complementan con servicios de streaming tanto de música como de videos, compartición y sincronización de documentos entre múltiples dispositivos, calendarios, copias de respaldo (backups), virtualización de escritorios, versionado de ficheros, chats, etc. El proyecto finaliza con una demostración de cómo utilizar dispositivos de un hogar digital interactuando con un servidor Cloud, en el que previamente se ha instalado y configurado una de las soluciones comparadas. Este servidor quedará empaquetado en una máquina virtual para que sea fácilmente transportable e utilizable. ABSTRACT. The technological world is changing towards optimizing resource management thanks to the powerful influence of technologies such as Virtualization and Cloud Computing. This document presents a closer approach to them, from the causes that have motivated to their last trends, as well as showing their main features, advantages and disadvantages. In addition, the Digital Home is a reality for most humans. It provides access to multiple types of telecommunication networks (3G, 4G, WI-FI, ADSL...) with more or less capacity, allowing Internet connections from anywhere, at any time, and with virtually any device (computer personal smartphones, tablets, televisions...).This is used by companies to provide all kinds of services. Some of these services offer storage on the cloud to devices with limited capacity, such as smartphones and tablets. That is normally storage space on servers under the control of important companies. Saving private documents, videos, photos, without being sure that they are not viewed by anyone without consent, can wake up suspicions in some users. For those users who want control over their privacy, it offers the possibility that it is the user himself to mount his own server and its own cloud service to share private information only with family and friends or with anyone with consent. During the project I have compared different solutions, most open source and with GNU licenses, for deploying one storage facility accessible via the Internet. Some supplement include streaming services of music , videos or photos, sharing and syncing documents across multiple devices, calendars, backups, desktop virtualization, file versioning, chats... The project ends with a demonstration of how to use our digital home devices interacting with a cloud server where one of the solutions compared is installed and configured. This server will be packaged in a virtual machine to be easily transportable and usable.
Resumo:
Debido al gran incremento de datos digitales que ha tenido lugar en los últimos años, ha surgido un nuevo paradigma de computación paralela para el procesamiento eficiente de grandes volúmenes de datos. Muchos de los sistemas basados en este paradigma, también llamados sistemas de computación intensiva de datos, siguen el modelo de programación de Google MapReduce. La principal ventaja de los sistemas MapReduce es que se basan en la idea de enviar la computación donde residen los datos, tratando de proporcionar escalabilidad y eficiencia. En escenarios libres de fallo, estos sistemas generalmente logran buenos resultados. Sin embargo, la mayoría de escenarios donde se utilizan, se caracterizan por la existencia de fallos. Por tanto, estas plataformas suelen incorporar características de tolerancia a fallos y fiabilidad. Por otro lado, es reconocido que las mejoras en confiabilidad vienen asociadas a costes adicionales en recursos. Esto es razonable y los proveedores que ofrecen este tipo de infraestructuras son conscientes de ello. No obstante, no todos los enfoques proporcionan la misma solución de compromiso entre las capacidades de tolerancia a fallo (o de manera general, las capacidades de fiabilidad) y su coste. Esta tesis ha tratado la problemática de la coexistencia entre fiabilidad y eficiencia de los recursos en los sistemas basados en el paradigma MapReduce, a través de metodologías que introducen el mínimo coste, garantizando un nivel adecuado de fiabilidad. Para lograr esto, se ha propuesto: (i) la formalización de una abstracción de detección de fallos; (ii) una solución alternativa a los puntos únicos de fallo de estas plataformas, y, finalmente, (iii) un nuevo sistema de asignación de recursos basado en retroalimentación a nivel de contenedores. Estas contribuciones genéricas han sido evaluadas tomando como referencia la arquitectura Hadoop YARN, que, hoy en día, es la plataforma de referencia en la comunidad de los sistemas de computación intensiva de datos. En la tesis se demuestra cómo todas las contribuciones de la misma superan a Hadoop YARN tanto en fiabilidad como en eficiencia de los recursos utilizados. ABSTRACT Due to the increase of huge data volumes, a new parallel computing paradigm to process big data in an efficient way has arisen. Many of these systems, called dataintensive computing systems, follow the Google MapReduce programming model. The main advantage of these systems is based on the idea of sending the computation where the data resides, trying to provide scalability and efficiency. In failure-free scenarios, these frameworks usually achieve good results. However, these ones are not realistic scenarios. Consequently, these frameworks exhibit some fault tolerance and dependability techniques as built-in features. On the other hand, dependability improvements are known to imply additional resource costs. This is reasonable and providers offering these infrastructures are aware of this. Nevertheless, not all the approaches provide the same tradeoff between fault tolerant capabilities (or more generally, reliability capabilities) and cost. In this thesis, we have addressed the coexistence between reliability and resource efficiency in MapReduce-based systems, looking for methodologies that introduce the minimal cost and guarantee an appropriate level of reliability. In order to achieve this, we have proposed: (i) a formalization of a failure detector abstraction; (ii) an alternative solution to single points of failure of these frameworks, and finally (iii) a novel feedback-based resource allocation system at the container level. Finally, our generic contributions have been instantiated for the Hadoop YARN architecture, which is the state-of-the-art framework in the data-intensive computing systems community nowadays. The thesis demonstrates how all our approaches outperform Hadoop YARN in terms of reliability and resource efficiency.
Resumo:
Los resultados presentados en la memoria de esta tesis doctoral se enmarcan en la denominada computación celular con membranas una nueva rama de investigación dentro de la computación natural creada por Gh. Paun en 1998, de ahí que habitualmente reciba el nombre de sistemas P. Este nuevo modelo de cómputo distribuido está inspirado en la estructura y funcionamiento de la célula. El objetivo de esta tesis ha sido analizar el poder y la eficiencia computacional de estos sistemas de computación celular. En concreto, se han analizado dos tipos de sistemas P: por un lado los sistemas P de neuronas de impulsos, y por otro los sistemas P con proteínas en las membranas. Para el primer tipo, los resultados obtenidos demuestran que es posible que estos sistemas mantengan su universalidad aunque muchas de sus características se limiten o incluso se eliminen. Para el segundo tipo, se analiza la eficiencia computacional y se demuestra que son capaces de resolver problemas de la clase de complejidad ESPACIO-P (PSPACE) en tiempo polinómico. Análisis del poder computacional: Los sistemas P de neuronas de impulsos (en adelante SN P, acrónimo procedente del inglés «Spiking Neural P Systems») son sistemas inspirados en el funcionamiento neuronal y en la forma en la que los impulsos se propagan por las redes sinápticas. Los SN P bio-inpirados poseen un numeroso abanico de características que ha cen que dichos sistemas sean universales y por tanto equivalentes, en poder computacional, a una máquina de Turing. Estos sistemas son potentes a nivel computacional, pero tal y como se definen incorporan numerosas características, quizás demasiadas. En (Ibarra et al. 2007) se demostró que en estos sistemas sus funcionalidades podrían ser limitadas sin comprometer su universalidad. Los resultados presentados en esta memoria son continuistas con la línea de trabajo de (Ibarra et al. 2007) y aportan nuevas formas normales. Esto es, nuevas variantes simplificadas de los sistemas SN P con un conjunto mínimo de funcionalidades pero que mantienen su poder computacional universal. Análisis de la eficiencia computacional: En esta tesis se ha estudiado la eficiencia computacional de los denominados sistemas P con proteínas en las membranas. Se muestra que este modelo de cómputo es equivalente a las máquinas de acceso aleatorio paralelas (PRAM) o a las máquinas de Turing alterantes ya que se demuestra que un sistema P con proteínas, es capaz de resolver un problema ESPACIOP-Completo como el QSAT(problema de satisfacibilidad de fórmulas lógicas cuantificado) en tiempo polinómico. Esta variante de sistemas P con proteínas es muy eficiente gracias al poder de las proteínas a la hora de catalizar los procesos de comunicación intercelulares. ABSTRACT The results presented at this thesis belong to membrane computing a new research branch inside of Natural computing. This new branch was created by Gh. Paun on 1998, hence usually receives the name of P Systems. This new distributed computing model is inspired on structure and functioning of cell. The aim of this thesis is to analyze the efficiency and computational power of these computational cellular systems. Specifically there have been analyzed two different classes of P systems. On the one hand it has been analyzed the Neural Spiking P Systems, and on the other hand it has been analyzed the P systems with proteins on membranes. For the first class it is shown that it is possible to reduce or restrict the characteristics of these kind of systems without loss of computational power. For the second class it is analyzed the computational efficiency solving on polynomial time PSACE problems. Computational Power Analysis: The spiking neural P systems (SN P in short) are systems inspired by the way of neural cells operate sending spikes through the synaptic networks. The bio-inspired SN Ps possess a large range of features that make these systems to be universal and therefore equivalent in computational power to a Turing machine. Such systems are computationally powerful, but by definition they incorporate a lot of features, perhaps too much. In (Ibarra et al. in 2007) it was shown that their functionality may be limited without compromising its universality. The results presented herein continue the (Ibarra et al. 2007) line of work providing new formal forms. That is, new SN P simplified variants with a minimum set of functionalities but keeping the universal computational power. Computational Efficiency Analisys: In this thesis we study the computational efficiency of P systems with proteins on membranes. We show that this computational model is equivalent to parallel random access machine (PRAM) or alternating Turing machine because, we show P Systems with proteins can solve a PSPACE-Complete problem as QSAT (Quantified Propositional Satisfiability Problem) on polynomial time. This variant of P Systems with proteins is very efficient thanks to computational power of proteins to catalyze inter-cellular communication processes.