20 results for Hervé Bouchard
Abstract:
Dissertation submitted for the Degree of Master in Computer Science (Mestre em Engenharia Informática)
Abstract:
Thesis submitted in fulfilment of the requirements for the Degree of Master of Science in Computer Science
Abstract:
The Graphics Processing Unit (GPU) is present in almost every modern personal computer. Despite their special-purpose design, GPUs have been increasingly used for general-purpose computation, with very good results. Hence, there is a growing effort from the community to seamlessly integrate such devices into everyday computing. However, to fully exploit the potential of a system comprising GPUs and CPUs, these devices should be presented to the programmer as a single platform. The efficient combination of the power of CPU and GPU devices is highly dependent on each device's characteristics, resulting in platform-specific applications that cannot be ported to different systems. Also, the most efficient work balance among devices is highly dependent on the computations to be performed and the respective data sizes. In this work, we propose a solution for heterogeneous environments based on the abstraction level provided by algorithmic skeletons. Our goal is to take full advantage of the power of all CPU and GPU devices present in a system, without the need for different kernel implementations or explicit work distribution. To that end, we extended Marrow, an algorithmic skeleton framework for multi-GPU systems, to support CPU computations and to balance the workload efficiently between devices. Our approach is based on an offline training execution that identifies the ideal work balance and platform configuration for a given application and input data size. The evaluation of this work shows that the combination of CPU and GPU devices can significantly boost the performance of our benchmarks in the tested environments, when compared to GPU-only executions.
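As a rough illustration of the offline training idea described in this abstract, the following Python sketch tries a fixed set of candidate CPU/GPU work splits and keeps the one with the smallest estimated completion time. It is not Marrow's implementation: the function `best_split`, the cost models `cpu_cost` and `gpu_cost`, and all timing constants are hypothetical stand-ins for measurements that a real training run would collect.

```python
from typing import Callable, Sequence


def best_split(
    cpu_time: Callable[[int], float],
    gpu_time: Callable[[int], float],
    n: int,
    fractions: Sequence[float],
) -> float:
    """Return the GPU work fraction with the smallest estimated makespan.

    Both devices are assumed to run concurrently, so the completion time
    of a split is the time of whichever device finishes last.
    """
    def makespan(fraction: float) -> float:
        gpu_items = round(n * fraction)
        cpu_items = n - gpu_items
        return max(cpu_time(cpu_items), gpu_time(gpu_items))

    return min(fractions, key=makespan)


# Hypothetical cost models (seconds) standing in for measured training runs.
def cpu_cost(items: int) -> float:
    return items * 4e-6


def gpu_cost(items: int) -> float:
    # Assumed fixed launch/transfer overhead plus a per-item cost.
    return 0.0 if items == 0 else 1e-3 + items * 1e-6


candidates = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100% on the GPU
split = best_split(cpu_cost, gpu_cost, 1_000_000, candidates)
```

Under these assumed costs the search assigns most of the work to the GPU while still keeping the CPU busy. Because the abstract says the ideal balance depends on the application and the input size, a real system would repeat the search for each of those combinations.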
Abstract:
Services, initially conceived for the business world, now have a much broader range of use, making it easier for applications to incorporate external software in the form of services. The main contributors to the emerging use of services are the proliferation of mobile devices, the growing popularity of cloud computing, and the ubiquity of the Internet. Despite this state of the art, the service abstraction is still mostly relegated to the middleware layer. Consequently, this confinement prevents the programmer from interacting with services at the language level. The absence of this level of abstraction hinders the deployment of dynamic applications. To address this, the goal of our work is to support the dynamism and deployment of service-oriented architectures. To that end, we address the problems of incorporating Web-accessible services and enable reconfiguration operations on them, namely dynamic binding, replacement of the service provider, and dynamic management of sets of service providers.
Abstract:
The Intel® Xeon Phi™ is the first processor based on Intel's MIC (Many Integrated Cores) architecture. It is a co-processor specially tailored for data-parallel computations, whose basic architectural design is similar to that of GPUs (Graphics Processing Units): it relies on many integrated, computationally simple cores to perform parallel computations. The main novelty of the MIC architecture, relative to GPUs, is its compatibility with the Intel x86 architecture. This enables the use of many of the tools commonly available for the parallel programming of x86-based architectures, which may lead to a gentler learning curve. However, programming the Xeon Phi still entails aspects intrinsic to accelerator-based computing in general, and to the MIC architecture in particular. In this thesis we advocate the use of algorithmic skeletons for programming the Xeon Phi. Algorithmic skeletons abstract the complexity inherent to parallel programming, hiding details such as resource management, parallel decomposition, and communication between execution flows, thus removing these concerns from the programmer's mind. In this context, the goal of the thesis is to lay the foundations for the development of a simple but powerful and efficient skeleton framework for programming the Xeon Phi processor. For this purpose we build upon Marrow, an existing framework for the orchestration of OpenCL™ computations in multi-GPU and CPU environments. We extend Marrow to execute both OpenCL and C++ parallel computations on the Xeon Phi. To evaluate the newly developed framework, several well-known benchmarks, such as Saxpy and N-Body, are used not only to compare its performance with that of the existing framework when executing on the co-processor, but also to assess the performance of the Xeon Phi against a multi-GPU environment.
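To make the notion of an algorithmic skeleton more concrete, the sketch below shows a minimal map skeleton in Python applied to Saxpy, one of the benchmarks named in this abstract. It is only an illustration of the general idea and does not reproduce Marrow's API: `map_skeleton` and `saxpy` are hypothetical names, and a thread pool stands in for the accelerator. The programmer supplies only the per-chunk kernel, while the skeleton handles decomposition, scheduling, and reassembly of the result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Chunk = Tuple[Sequence[float], Sequence[float]]


def map_skeleton(
    kernel: Callable[[Chunk], List[float]],
    xs: Sequence[float],
    ys: Sequence[float],
    workers: int = 4,
) -> List[float]:
    """Split the inputs into chunks, run the kernel on each, and rejoin."""
    n = len(xs)
    size = max(1, -(-n // workers))  # ceiling division, at least 1
    chunks = [(xs[i:i + size], ys[i:i + size]) for i in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order.
        parts = pool.map(kernel, chunks)
        return [value for part in parts for value in part]


def saxpy(a: float) -> Callable[[Chunk], List[float]]:
    """Build a kernel computing a * x + y element-wise over one chunk."""
    def kernel(chunk: Chunk) -> List[float]:
        chunk_x, chunk_y = chunk
        return [a * x + y for x, y in zip(chunk_x, chunk_y)]
    return kernel


result = map_skeleton(saxpy(2.0), [1.0, 2.0, 3.0, 4.0, 5.0], [10.0] * 5, workers=2)
```

The kernel contains no threading or partitioning logic, which is the separation of concerns the abstract attributes to skeletons. Swapping the execution backend would, in principle, leave `saxpy` unchanged.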