Digital Library

857 results for Design|Architecture

Algorithm-oriented design of efficient many-core architectures applied to dense matrix multiplication

Relevance:

30.00%

Publisher:

Abstract:

Recent integrated circuit technologies have opened the possibility of designing parallel architectures with hundreds of cores on a single chip. The design space of these parallel architectures is huge, with many architectural options. Exploring the design space becomes even more difficult if, beyond performance and area, we also consider extra metrics such as performance and area efficiency, where the designer seeks the architecture with the best performance per chip area and the best sustainable performance. In this paper we present an algorithm-oriented approach to designing a many-core architecture. Instead of exploring the design space of the many-core architecture based on the experimental execution results of a particular benchmark of algorithms, our approach is to make a formal analysis of the algorithms considering the main architectural aspects and to determine how each architectural aspect relates to the performance of the architecture when running an algorithm or set of algorithms. The architectural aspects considered include the number of cores, the local memory available in each core, the communication bandwidth between the many-core architecture and the external memory, and the memory hierarchy. To exemplify the approach, we performed a theoretical analysis of a dense matrix multiplication algorithm and derived an equation that relates the number of execution cycles to the architectural parameters. Based on this equation, a many-core architecture has been designed. The results obtained indicate that a 100 mm² integrated circuit design of the proposed architecture, using a 65 nm technology, is able to achieve 464 GFLOPs (double-precision floating-point) for a memory bandwidth of 16 GB/s. This corresponds to a performance efficiency of 71%. Considering a 45 nm technology, a 100 mm² chip attains 833 GFLOPs, which corresponds to 84% of peak performance. These figures are better than those obtained by previous many-core architectures, except for the area efficiency, which is limited by the lower memory bandwidth considered. The results achieved are also better than those of previous state-of-the-art many-core architectures designed specifically to achieve high performance for matrix multiplication.
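As a rough check on the figures quoted in this abstract, the implied peak performance can be recovered by dividing the sustained performance by the stated efficiency. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "performance efficiency" means sustained performance divided by peak performance, which the abstract suggests but does not define formally. The derived values are not stated in the abstract itself.

```python
def implied_peak_gflops(sustained_gflops: float, efficiency: float) -> float:
    """Peak performance implied by a sustained figure and an efficiency in (0, 1].

    Assumption: efficiency = sustained / peak.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return sustained_gflops / efficiency


def flops_per_byte(sustained_gflops: float, bandwidth_gb_s: float) -> float:
    """Floating-point operations sustained per byte of external memory traffic."""
    return sustained_gflops / bandwidth_gb_s


# Figures quoted in the abstract.
peak_65nm = implied_peak_gflops(464.0, 0.71)   # 65 nm design, 71% efficiency
peak_45nm = implied_peak_gflops(833.0, 0.84)   # 45 nm design, 84% efficiency
intensity_65nm = flops_per_byte(464.0, 16.0)   # 464 GFLOPs at 16 GB/s

print(f"Implied peak at 65 nm: {peak_65nm:.1f} GFLOPs")
print(f"Implied peak at 45 nm: {peak_45nm:.1f} GFLOPs")
print(f"Sustained FLOPs per byte at 65 nm: {intensity_65nm:.1f}")
```

Under these assumptions, the 65 nm design sustains 29 floating-point operations for every byte fetched from external memory, which shows how heavily the result relies on data reuse in the local memories.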

Sparse matrix multiplication on a reconfigurable many-core architecture

Relevance:

30.00%

Publisher:

Abstract:

Sparse matrix-vector multiplication (SMVM) is a fundamental operation in many scientific and engineering applications. In many cases sparse matrices have thousands of rows and columns where most of the entries are zero, while the non-zero data is spread over the matrix. This poor data locality reduces the effectiveness of data caches in general-purpose processors, considerably reducing their performance efficiency compared with what is achieved for dense matrix multiplication. In this paper, we propose a parallel processing solution for SMVM in a many-core architecture. The architecture is tested with known benchmarks using a ZYNQ-7020 FPGA. The architecture is scalable in the number of core elements and limited only by the available memory bandwidth. It achieves performance efficiencies of up to almost 70% and better performance than previous FPGA designs.
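To make the data-locality problem concrete, here is a minimal sequential sketch of sparse matrix-vector multiplication. It assumes the compressed sparse row (CSR) storage format, which the abstract does not mention; the function name `csr_matvec` and the sample matrix are hypothetical, and this is not the parallel architecture the paper proposes.

```python
from typing import List, Sequence


def csr_matvec(values: Sequence[float],
               col_idx: Sequence[int],
               row_ptr: Sequence[int],
               x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : non-zero entries, row by row
    col_idx : column index of each non-zero entry
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the entries of row i
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read at an irregular, data-dependent position:
            # this indirect access is what defeats data caches.
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Sample 3x3 matrix: [[4, 0, 0], [0, 0, 2], [1, 0, 3]]
values = [4.0, 2.0, 1.0, 3.0]
col_idx = [0, 2, 0, 2]
row_ptr = [0, 1, 2, 4]
x = [1.0, 2.0, 3.0]

y = csr_matvec(values, col_idx, row_ptr, x)
print(y)  # [4.0, 6.0, 10.0]
```

Each row's dot product is independent of the others, which is one reason the operation lends itself to distribution across many cores, subject to the memory bandwidth limit the abstract mentions.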

System-on-chip field-programmable gate array design for onboard real-time hyperspectral unmixing

Relevance:

30.00%

Publisher:

Abstract:

Hyperspectral instruments have been incorporated in satellite missions, providing large amounts of high-spectral-resolution data of the Earth's surface. This data can be used in remote sensing applications that often require a real-time or near-real-time response. To avoid delays between hyperspectral image acquisition and its interpretation, the latter usually done at a ground station, onboard systems have emerged to process data, reducing the volume of information to transfer from the satellite to the ground station. For this purpose, compact reconfigurable hardware modules, such as field-programmable gate arrays (FPGAs), are widely used. This paper proposes an FPGA-based architecture for hyperspectral unmixing. The method is based on vertex component analysis (VCA) and works without a dimensionality reduction preprocessing step. The architecture has been designed for a low-cost Xilinx Zynq board with a Zynq-7020 system-on-chip based on Artix-7 FPGA programmable logic and tested using real hyperspectral data. Experimental results indicate that the proposed implementation can achieve real-time processing while maintaining the method's accuracy, which indicates the potential of the proposed platform to implement high-performance, low-cost embedded systems, opening perspectives for onboard hyperspectral image processing.

A Novel Run-Time Monitoring Architecture for Safe and Efficient Inline Monitoring

Relevance:

30.00%

Publisher:

Abstract:

20th International Conference on Reliable Software Technologies - Ada-Europe 2015 (Ada-Europe 2015), Madrid, Spain.

ROAZ and ROAZ II Autonomous Surface Vehicle Design and Implementation

Relevance:

30.00%

Publisher:

Abstract:

International Lifesaving Congress 2007, La Coruna, Spain, December, 2007

ROAZ Autonomous Surface Vehicle Design and Implementation

Relevance:

30.00%

Publisher:

Abstract:

The design of an Autonomous Surface Vehicle for operation in river and estuarine scenarios is presented. Multiple operations with autonomous underwater vehicles (AUVs) and support for AUV missions are among the main design goals of the ROAZ system. The mechanical design issues are discussed. Hardware, software and implementation status are described along with the control and navigation system architecture. Some preliminary test results concerning a custom-developed thruster are presented along with hydrodynamic drag calculations obtained using computational fluid dynamics methods.

Design of a low-voltage CMOS RF receiver for energy harvesting sensor node

Relevance:

30.00%

Publisher:

Abstract:

In this thesis a low-power, low-voltage CMOS RF receiver front-end is presented. The main objective is to design this RF receiver so that it can be powered by a piezoelectric energy harvesting power source, included in a Wireless Sensor Node application. For this type of application the major requirements are low-power and low-voltage operation, reduced area and cost, and simplicity of the architecture. The key system blocks are the LNA and the mixer, which are studied and optimized in greater detail, achieving good linearity, wideband operation and low added noise. A wideband balun LNA with noise and distortion cancelling is designed to work at a 0.6 V supply voltage, in conjunction with a double-balanced passive mixer and a subsequent TIA block. The passive mixer operates in current mode, allowing minimal introduction of voltage noise and good linearity. The receiver analog front-end has a total voltage conversion gain of 31.5 dB, a 0.1 - 4.3 GHz bandwidth, an IIP3 value of -1.35 dBm, and a noise figure lower than 9 dB. The total power consumption is 1.9 mW and the die area is 305 × 134.5 µm², using a standard 130 nm CMOS technology.
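The decibel figures quoted in this abstract can be converted to linear quantities with the standard definitions. The sketch below is only a reading aid: the helper names are hypothetical, and the linear values it prints are derived here rather than taken from the thesis.

```python
def db_to_voltage_ratio(gain_db: float) -> float:
    """Voltage gain in dB to a linear voltage ratio (20*log10 convention)."""
    return 10 ** (gain_db / 20)


def db_to_power_ratio(value_db: float) -> float:
    """Power quantity in dB to a linear power ratio (10*log10 convention)."""
    return 10 ** (value_db / 10)


def dbm_to_milliwatts(level_dbm: float) -> float:
    """Power level in dBm to milliwatts (0 dBm = 1 mW)."""
    return 10 ** (level_dbm / 10)


# Figures quoted in the abstract.
voltage_gain = db_to_voltage_ratio(31.5)       # conversion gain of 31.5 dB
noise_factor_bound = db_to_power_ratio(9.0)    # noise figure below 9 dB
iip3_mw = dbm_to_milliwatts(-1.35)             # IIP3 of -1.35 dBm

print(f"Voltage conversion gain: about {voltage_gain:.1f} V/V")
print(f"Noise factor upper bound: about {noise_factor_bound:.2f}")
print(f"IIP3: about {iip3_mw:.2f} mW")
```

Note the two conventions: voltage gain uses a factor of 20 in the exponent's denominator, while noise figure and power levels use 10.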

Systems biology approaches for the design of novel Saccharomyces cerevisiae winemaking strains for enhanced flavour compounds synthesis

Relevance:

30.00%

Publisher:

Abstract:

Doctoral thesis in Environmental and Molecular Biology

Multithreading RTOS processor design

Relevance:

30.00%

Publisher:

Abstract:

Doctoral thesis, Doctoral Program in Electronics and Computer Engineering.

A modular traffic sampling architecture for flexible network measurements

Relevance:

30.00%

Publisher:

Abstract:

Master's dissertation (Doctoral Program in Informatics)

An evolutionary approach to the use of Petri net based models: from parallel controllers to HW/SW co-design

Relevance:

30.00%

Publisher:

Abstract:

"A workshop within the 19th International Conference on Applications and Theory of Petri Nets - ICATPN’1998"

Parallel Architecture Prototype for 60 GHz High Data Rate Wireless Single Carrier Receiver

Relevance:

30.00%

Publisher:

Abstract:

The potential of the 60 GHz frequency band for wireless communications currently attracts considerable attention from academia and research teams. The use of the 60 GHz frequency band offers great possibilities for a wide variety of applications that are yet to be implemented. These applications also imply huge implementation challenges. One such example is building a high-data-rate transceiver that at the same time has very low power consumption. In this paper we present a prototype of a Single Carrier (SC) transceiver system, giving a brief overview of the baseband design and emphasizing the most important decisions that need to be made. A brief overview of the possible approaches to implementing the equalizer, the most complex module in the SC transceiver, is also presented. The main focus of this paper is to suggest a parallel architecture for the receiver in a Single Carrier communication system. This would allow the communication system to achieve higher data rates, at the price of higher power consumption. The suggested receiver architecture is illustrated in this paper, together with the results of its implementation in comparison with the corresponding serial implementation.

Software for Explicitly Parallel Memory-Centric Processor Architecture

Relevance:

30.00%

Publisher:

Abstract:

Advances in computer memory technology justify research into new and different views of computer organization. This paper proposes a novel memory-centric computing architecture with the goal of merging memory and processing elements in order to provide better conditions for parallelization and performance. The paper introduces the architectural concepts and then shows the design and implementation of a corresponding assembler and simulator.

System analysis of a Peer-to-Peer Video-on-Demand architecture: Kangaroo

Relevance:

30.00%

Publisher:

Abstract:

The architectural design and deployment of Peer-to-Peer Video-on-Demand (P2P-VoD) systems that support VCR functionalities are attracting the interest of an increasing number of research groups within the scientific community, especially due to the intrinsic characteristics of such systems and the benefits that peers can provide in reducing the server load. This work focuses on the performance analysis of a P2P-VoD system, considering user behaviors obtained from real traces together with other synthetic user patterns. The experiments performed show that it is feasible to achieve a performance close to the best possible. Future work will consider monitoring the physical characteristics of the network in order to improve the design of different aspects of a VoD system.

Design of Novel Frequency Generation Circuit with Application to Ultra-Wideband Distributed Oscillators

Relevance:

30.00%

Publisher:

Abstract:

Given the urgency of a new paradigm in wireless digital transmission that should allow for higher bit rates, lower latency and tighter delay constraints, it has been proposed to investigate the fundamental building blocks that, at the circuit/device level, will drive the change towards a more efficient network architecture, with high capacity, higher bandwidth and a more satisfactory end-user experience. At the core of each transceiver there are inherently analog devices that provide the carrier signal: the oscillators. It is strongly believed that many limitations in today's communication protocols could be relieved by permitting high-carrier-frequency radio transmission and by having some degree of reconfigurability. This led us to study distributed oscillator architectures that work in the microwave range and possess wideband tuning capability. As microwave oscillators are essentially nonlinear devices, a full nonlinear analysis, synthesis and optimization had to be considered for their implementation. Consequently, the nonlinear numerical techniques most used in commercial EDA software have been reviewed. An application of the aforementioned techniques has been shown, considering a system of three coupled oscillators (a "triple-push" oscillator) in which the stability of the various oscillating modes has been studied. Provided that a certain phase distribution is maintained among the oscillating elements, this topology permits a rise in the output power of the third harmonic; nevertheless, due to circuit symmetry, "unwanted" oscillating modes coexist with the intended one. Starting with the necessary background on distributed amplification and distributed oscillator theory, the design of a four-stage reverse-mode distributed voltage-controlled oscillator (DVCO) using lumped elements has been presented. All the design steps have been reported and, for the first time, a method for an optimized design with reduced variations in the output power has been presented. Ongoing work is devoted to modeling a wideband DVCO and to implementing a frequency divider.
