Biblioteca Digital

100 resultados para MICROPROCESSORS

A fault-tolerant multi-transputer architecture

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A new fault-tolerant multi-transputer architecture capable of tolerating failure of any one component in the system is described. In the proposed architecture the processing nodes are automatically reconfigured in the event of a fault and the computations continue from the stage where the fault occurred. The process of reconfiguration is transparent to the user, and the identity of the failed component is communicated to the user along with the results of computations. Parallel solution of a typical engineering problem involving solution of Laplace's equation by the boundary element method has been implemented. The performance of the architecture in the event of faults has been investigated.

Dual-DSP system for signal and image processing

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The design of a dual-DSP microprocessor system and its application for parallel FFT and two-dimensional convolution are explained. The system is based on a master-salve configuration. Two ADSP-2101s are configured as slave processors and a PC/AT serves as the master. The master serves as a control processor to transfer the program code and data to the DSPs. The system architecture and the algorithms for the two applications, viz. FFT and two-dimensional convolutions, are discussed.

Energy-Efficient Fault Tolerance in Chip Multiprocessors Using Critical Value Forwarding

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip microprocessors (CMPs). In this paper we introduce a new energy-efficient fault-tolerant CMP architecture known as Redundant Execution using Critical Value Forwarding (RECVF). RECVF is based on two observations: (i) forwarding critical instruction results from the leading to the trailing core enables the latter to execute faster, and (ii) this speedup can be exploited to reduce energy consumption by operating the trailing core at a lower voltage-frequency level. Our evaluation shows that RECVF consumes 37% less energy than conventional dual modular redundant (DMR) execution of a program. It consumes only 1.26 times the energy of a non-fault-tolerant baseline and has a performance overhead of just 1.2%.

A Predictive Perfomance Model for Superscalar Processors

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Designing and optimizing high performance microprocessors is an increasingly difficult task due to the size and complexity of the processor design space, high cost of detailed simulation and several constraints that a processor design must satisfy. In this paper, we propose the use of empirical non-linear modeling techniques to assist processor architects in making design decisions and resolving complex trade-offs. We propose a procedure for building accurate non-linear models that consists of the following steps: (i) selection of a small set of representative design points spread across processor design space using latin hypercube sampling, (ii) obtaining performance measures at the selected design points using detailed simulation, (iii) building non-linear models for performance using the function approximation capabilities of radial basis function networks, and (iv) validating the models using an independently and randomly generated set of design points. We evaluate our model building procedure by constructing non-linear performance models for programs from the SPEC CPU2000 benchmark suite with a microarchitectural design space that consists of 9 key parameters. Our results show that the models, built using a relatively small number of simulations, achieve high prediction accuracy (only 2.8% error in CPI estimates on average) across a large processor design space. Our models can potentially replace detailed simulation for common tasks such as the analysis of key microarchitectural trends or searches for optimal processor design points.

Energy-efficient redundant execution for chip multiprocessors

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip microprocessors (CMPs). In this paper, we describe a power-efficient architecture for redundant execution on chip multiprocessors (CMPs) which when coupled with our per-core dynamic voltage and frequency scaling (DVFS) algorithm significantly reduces the energy overhead of redundant execution without sacrificing performance. Our evaluation shows that this architecture has a performance overhead of only 0.3% and consumes only 1.48 times the energy of a non-fault-tolerant baseline.

Traffic engineered NoC for streaming applications

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Streaming applications demand hard bandwidth and throughput guarantees in a multiprocessor environment amidst resource competing processes. We present a Label Switching based Network-on-Chip (LS-NoC) motivated by throughput guarantees offered by bandwidth reservation. Label switching is a packet relaying technique in which individual packets carry route information in the form of labels. A centralized LS-NoC Management framework engineers traffic into Quality of Service (QoS) guaranteed routes. LS-NoC caters to the requirements of streaming applications where communication channels are fixed over the lifetime of the application. The proposed NoC framework inherently supports heterogeneous and ad hoc system-on-chips. The LS-NoC can be used in conjunction with conventional best effort NoC as a QoS guaranteed communication network or as a replacement to the conventional NoC. A multicast, broadcast capable label switched router for the LS-NoC has been designed. A 5 port, 256 bit data bus, 4 bit label router occupies 0.431 mm(2) in 130 nm and delivers peak bandwidth of 80 Gbits/s per link at 312.5 MHz. Bandwidth and latency guarantees of LS-NoC have been demonstrated on traffic from example streaming applications and on constant and variable bit rate traffic patterns. LS-NoC was found to have a competitive AreaxPower/Throughput figure of merit with state-of-the-art NoCs providing QoS. Circuit switching with link sharing abilities and support for asynchronous operation make LS-NoC a desirable choice for QoS servicing in chip multiprocessors. (C) 2013 Elsevier B.V. All rights reserved.

IBM PC Data Acquisition and Processing Software Evaluation

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Commercially available software packages for IBM PC-compatibles are evaluated to use for data acquisition and processing work. Moss Landing Marine Laboratories (MLML) acquired computers since 1978 to use on shipboard data acquisition (Le. CTD, radiometric, etc.) and data processing. First Hewlett-Packard desktops were used then a transition to the DEC VAXstations, with software developed mostly by the author and others at MLML (Broenkow and Reaves, 1993; Feinholz and Broenkow, 1993; Broenkow et al, 1993). IBM PC were at first very slow and limited in available software, so they were not used in the early days. Improved technology such as higher speed microprocessors and a wide range of commercially available software made use of PC more reasonable today. MLML is making a transition towards using the PC for data acquisition and processing. Advantages are portability and available outside support.

Sistema de control multipropósito programable por pulsación de teclas con álgebra RPN.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

[ES]Para la elaboración del presente proyecto, primeramente se explicara los requerimientos técnicos y el origen del microprocesador a utilizar para poder situar y centrar el tema del trabajo. Una vez acotado y delimitado el tema objeto de estudio, se planteara una arquitectura de software sobre la posibilidad de generar unas “pseudolibrerias” de mayor nivel de programación que el ensamblador. Posteriormente, se verificara la posible viabilidad o no de tal planteamiento, exponiendo sus resultados y las consideraciones oportunas a las que nos ha llevado su estudio. Para ello se analizara en una primera instancia el microprocesador a utilizar, que será el PIC16F887, centrándonos en el debido al amplio manejo y conocimiento que poseemos sobre este microprocesador. El objeto de este escrito será presentar una oferta económica relativa al desarrollo e instalación de dichos microprocesadores para la mejora en el ámbito industrial. Finalmente, realizaremos un estudio sobre la implementación de este tipo de arquitectura software en diferentes microprocesadores de mayores prestaciones, estudiando si la infraestructura será eficiente, funcional y económica.

P-notation: High level description language for software design

Relevância:

10.00% 10.00%

Publicador:

Load-balancing for mesh-based applications on heterogeneous cluster computers

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper discusses load-balancing issues when using heterogeneous cluster computers. There is a growing trend towards the use of commodity microprocessor clusters. Although today's microprocessors have reached a theoretical peak performance in the range of one GFLOPS/s, heterogeneous clusters of commodity processors are amongst the most challenging parallel systems to programme efficiently. We will outline an approach for optimising the performance of parallel mesh-based applications for heterogeneous cluster computers and present case studies with the GeoFEM code. The focus is on application cost monitoring and load balancing using the DRAMA library.

HIDE: A hardware intelligent description environment

Relevância:

10.00% 10.00%

Publicador:

Evaluation of Random Delay Insertion against DPA on FPGAs

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Side-channel attacks (SCA) threaten electronic cryptographic devices and can be carried out by monitoring the physical characteristics of security circuits. Differential Power Analysis (DPA) is one the most widely studied side-channel attacks. Numerous countermeasure techniques, such as Random Delay Insertion (RDI), have been proposed to reduce the risk of DPA attacks against cryptographic devices. The RDI technique was first proposed for microprocessors but it was shown to be unsuccessful when implemented on smartcards as it was vulnerable to a variant of the DPA attack known as the Sliding-Window DPA attack.Previous research by the authors investigated the use of the RDI countermeasure for Field Programmable Gate Array (FPGA) based cryptographic devices. A split-RDI technique wasproposed to improve the security of the RDI countermeasure. A set of critical parameters wasalso proposed that could be utilized in the design stage to optimize a security algorithm designwith RDI in terms of area, speed and power. The authors also showed that RDI is an efficientcountermeasure technique on FPGA in comparison to other countermeasures.In this article, a new RDI logic design is proposed that can be used to cost-efficiently implementRDI on FPGA devices. Sliding-Window DPA and realignment attacks, which were shown to beeffective against RDI implemented on smartcard devices, are performed on the improved RDIFPGA implementation. We demonstrate that these attacks are unsuccessful and we also proposea realignment technique that can be used to demonstrate the weakness of RDI implementations.

Advanced educational parallel DSP system based on TMS320C25 processors

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes the design, application, and evaluation of a user friendly, flexible, scalable and inexpensive Advanced Educational Parallel (AdEPar) digital signal processing (DSP) system based on TMS320C25 digital processors to implement DSP algorithms. This system will be used in the DSP laboratory by graduate students to work on advanced topics such as developing parallel DSP algorithms. The graduating senior students who have gained some experience in DSP can also use the system. The DSP laboratory has proved to be a useful tool in the hands of the instructor to teach the mathematically oriented topics of DSP that are often difficult for students to grasp. The DSP laboratory with assigned projects has greatly improved the ability of the students to understand such complex topics as the fast Fourier transform algorithm, linear and circular convolution, the theory and design of infinite impulse response (IIR) and finite impulse response (FIR) filters. The user friendly PC software support of the AdEPar system makes it easy to develop DSP programs for students. This paper gives the architecture of the AdEPar DSP system. The communication between processors and the PC-DSP processor communication are explained. The parallel debugger kernels and the restrictions of the system are described. The programming in the AdEPar is explained, and two benchmarks (parallel FFT and DES) are presented to show the system performance.

Clustered indexing for branch predictors

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As a result of resource limitations, state in branch predictors is frequently shared between uncorrelated branches. This interference can significantly limit prediction accuracy. In current predictor designs, the branches sharing prediction information are determined by their branch addresses and thus branch groups are arbitrarily chosen during compilation. This feasibility study explores a more analytic and systematic approach to classify branches into clusters with similar behavioral characteristics. We present several ways to incorporate this cluster information as an additional information source in branch predictors.

A comparative analysis of fault injection methods via enhanced on-chip debug infrastructures

Relevância:

10.00% 10.00%

Publicador:

Resumo:

On-chip debug (OCD) features are frequently available in modern microprocessors. Their contribution to shorten the time-to-market justifies the industry investment in this area, where a number of competing or complementary proposals are available or under development, e.g. NEXUS, CJTAG, IJTAG. The controllability and observability features provided by OCD infrastructures provide a valuable toolbox that can be used well beyond the debugging arena, improving the return on investment rate by diluting its cost across a wider spectrum of application areas. This paper discusses the use of OCD features for validating fault tolerant architectures, and in particular the efficiency of various fault injection methods provided by enhanced OCD infrastructures. The reference data for our comparative study was captured on a workbench comprising the 32-bit Freescale MPC-565 microprocessor, an iSYSTEM IC3000 debugger (iTracePro version) and the Winidea 2005 debugging package. All enhanced OCD infrastructures were implemented in VHDL and the results were obtained by simulation within the same fault injection environment. The focus of this paper is on the comparative analysis of the experimental results obtained for various OCD configurations and debugging scenarios.

«
1
2
3
4
5
6
7
»