991 resultados para parallel architecture
Resumo:
Monitoring systems have traditionally been developed with rigid objectives and functionalities, and tied to specific languages, libraries and run-time environments. There is a need for more flexible monitoring systems which can be easily adapted to distinct requirements. On-line monitoring has been considered as increasingly important for observation and control of a distributed application. In this paper we discuss monitoring interfaces and architectures which support more extensible monitoring and control services. We describe our work on the development of a distributed monitoring infrastructure, and illustrate how it eases the implementation of a complex distributed debugging architecture. We also discuss several issues concerning support for tool interoperability and illustrate how the cooperation among multiple concurrent tools can ease the task of distributed debugging.
Resumo:
This paper presents a new parallel implementation of a previously hyperspectral coded aperture (HYCA) algorithm for compressive sensing on graphics processing units (GPUs). HYCA method combines the ideas of spectral unmixing and compressive sensing exploiting the high spatial correlation that can be observed in the data and the generally low number of endmembers needed in order to explain the data. The proposed implementation exploits the GPU architecture at low level, thus taking full advantage of the computational power of GPUs using shared memory and coalesced accesses to memory. The proposed algorithm is evaluated not only in terms of reconstruction error but also in terms of computational performance using two different GPU architectures by NVIDIA: GeForce GTX 590 and GeForce GTX TITAN. Experimental results using real data reveals signficant speedups up with regards to serial implementation.
Resumo:
This paper proposes an FPGA-based architecture for onboard hyperspectral unmixing. This method based on the Vertex Component Analysis (VCA) has several advantages, namely it is unsupervised, fully automatic, and it works without dimensionality reduction (DR) pre-processing step. The architecture has been designed for a low cost Xilinx Zynq board with a Zynq-7020 SoC FPGA based on the Artix-7 FPGA programmable logic and tested using real hyperspectral datasets. Experimental results indicate that the proposed implementation can achieve real-time processing, while maintaining the methods accuracy, which indicate the potential of the proposed platform to implement high-performance, low cost embedded systems.
Resumo:
Hyperspectral imaging can be used for object detection and for discriminating between different objects based on their spectral characteristics. One of the main problems of hyperspectral data analysis is the presence of mixed pixels, due to the low spatial resolution of such images. This means that several spectrally pure signatures (endmembers) are combined into the same mixed pixel. Linear spectral unmixing follows an unsupervised approach which aims at inferring pure spectral signatures and their material fractions at each pixel of the scene. The huge data volumes acquired by such sensors put stringent requirements on processing and unmixing methods. This paper proposes an efficient implementation of a unsupervised linear unmixing method on GPUs using CUDA. The method finds the smallest simplex by solving a sequence of nonsmooth convex subproblems using variable splitting to obtain a constraint formulation, and then applying an augmented Lagrangian technique. The parallel implementation of SISAL presented in this work exploits the GPU architecture at low level, using shared memory and coalesced accesses to memory. The results herein presented indicate that the GPU implementation can significantly accelerate the method's execution over big datasets while maintaining the methods accuracy.
Resumo:
Many Hyperspectral imagery applications require a response in real time or near-real time. To meet this requirement this paper proposes a parallel unmixing method developed for graphics processing units (GPU). This method is based on the vertex component analysis (VCA), which is a geometrical based method highly parallelizable. VCA is a very fast and accurate method that extracts endmember signatures from large hyperspectral datasets without the use of any a priori knowledge about the constituent spectra. Experimental results obtained for simulated and real hyperspectral datasets reveal considerable acceleration factors, up to 24 times.
Resumo:
This paper presents a methodology for multi-objective day-ahead energy resource scheduling for smart grids considering intensive use of distributed generation and Vehicle- To-Grid (V2G). The main focus is the application of weighted Pareto to a multi-objective parallel particle swarm approach aiming to solve the dual-objective V2G scheduling: minimizing total operation costs and maximizing V2G income. A realistic mathematical formulation, considering the network constraints and V2G charging and discharging efficiencies is presented and parallel computing is applied to the Pareto weights. AC power flow calculation is included in the metaheuristics approach to allow taking into account the network constraints. A case study with a 33-bus distribution network and 1800 V2G resources is used to illustrate the performance of the proposed method.
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Informática
Resumo:
The forthcoming smart grids are comprised of integrated microgrids operating in grid-connected and isolated mode with local generation, storage and demand response (DR) programs. The proposed model is based on three successive complementary steps for power transaction in the market environment. The first step is characterized as a microgrid’s internal market; the second concerns negotiations between distinct interconnected microgrids; and finally, the third refers to the actual electricity market. The proposed approach is modeled and tested using a MAS framework directed to the study of the smart grids environment, including the simulation of electricity markets. This is achieved through the integration of the proposed approach with the MASGriP (Multi-Agent Smart Grid Platform) system.
Resumo:
Coarse Grained Reconfigurable Architectures (CGRAs) are emerging as enabling platforms to meet the high performance demanded by modern applications (e.g. 4G, CDMA, etc.). Recently proposed CGRAs offer time-multiplexing and dynamic applications parallelism to enhance device utilization and reduce energy consumption at the cost of additional memory (up to 50% area of the overall platform). To reduce the memory overheads, novel CGRAs employ either statistical compression, intermediate compact representation, or multicasting. Each compaction technique has different properties (i.e. compression ratio, decompression time and decompression energy) and is best suited for a particular class of applications. However, existing research only deals with these methods separately. Moreover, they only analyze the compaction ratio and do not evaluate the associated energy overheads. To tackle these issues, we propose a polymorphic compression architecture that interleaves these techniques in a unique platform. The proposed architecture allows each application to take advantage of a separate compression/decompression hierarchy (consisting of various types and implementations of hardware/software decoders) tailored to its needs. Simulation results, using different applications (FFT, Matrix multiplication, and WLAN), reveal that the choice of compression hierarchy has a significant impact on compression ratio (up to 52%), decompression energy (up to 4 orders of magnitude), and configuration time (from 33 n to 1.5 s) for the tested applications. Synthesis results reveal that introducing adaptivity incurs negligible additional overheads (1%) compared to the overall platform area.
Resumo:
Euromicro Conference on Digital System Design (DSD 2015), Funchal, Portugal.
Resumo:
6th Real-Time Scheduling Open Problems Seminar (RTSOPS 2015), Lund, Sweden.
Resumo:
20th International Conference on Reliable Software Technologies - Ada-Europe 2015 (Ada-Europe 2015), Madrid, Spain.
Resumo:
The 30th ACM/SIGAPP Symposium On Applied Computing (SAC 2015). 13 to 17, Apr, 2015, Embedded Systems. Salamanca, Spain.
Resumo:
Distributed real-time systems such as automotive applications are becoming larger and more complex, thus, requiring the use of more powerful hardware and software architectures. Furthermore, those distributed applications commonly have stringent real-time constraints. This implies that such applications would gain in flexibility if they were parallelized and distributed over the system. In this paper, we consider the problem of allocating fixed-priority fork-join Parallel/Distributed real-time tasks onto distributed multi-core nodes connected through a Flexible Time Triggered Switched Ethernet network. We analyze the system requirements and present a set of formulations based on a constraint programming approach. Constraint programming allows us to express the relations between variables in the form of constraints. Our approach is guaranteed to find a feasible solution, if one exists, in contrast to other approaches based on heuristics. Furthermore, approaches based on constraint programming have shown to obtain solutions for these type of formulations in reasonable time.
Resumo:
Demo in Workshop on ns-3 (WNS3 2015). 13 to 14, May, 2015. Castelldefels, Spain.