50 results for Stair Nested Designs
Abstract:
This paper presents a clocking pipeline technique referred to as a single-pulse pipeline (PP-pipeline) and applies it to the problem of mapping pipelined circuits to a Field Programmable Gate Array (FPGA). A PP-pipeline replicates the operation of asynchronous micropipelined control mechanisms using the synchronous-oriented logic resources commonly found in FPGA devices. Consequently, circuits with an asynchronous-like pipeline operation can be efficiently synthesized using a synchronous design methodology. The technique can be extended to include data-completion circuitry to take advantage of variable data-completion processing time in synchronous pipelined designs. It is also shown that the PP-pipeline reduces the clock tree power consumption of pipelined circuits. These potential applications are demonstrated by post-synthesis simulation of FPGA circuits. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
This paper contributes to bridging the theory and practice of the polyhedral model for designing parallel algorithms. Although the theory of the polyhedral model is well developed, designers of massively parallel algorithms are unable to benefit from it because of the lack of software tools that incorporate the wide range of transformations possible in the model. The Uniformization tool that we developed was the first to integrate a number of techniques and to completely automate the transformation step, allowing designers to explore a wide range of feasible designs from high-level specifications.
Abstract:
Since its introduction in 1993, the Message Passing Interface (MPI) has become a de facto standard for writing High Performance Computing (HPC) applications on clusters and Massively Parallel Processors (MPPs). The recent emergence of multi-core processor systems presents a new challenge for established parallel programming paradigms, including those based on MPI. This paper presents a new Java messaging system called MPJ Express. Using this system, we exploit multiple levels of parallelism - messaging and threading - to improve application performance on multi-core processors. We refer to our approach as nested parallelism. This MPI-like Java library can support nested parallelism by using Java or Java OpenMP (JOMP) threads within an MPJ Express process. The practicality of this approach is assessed by porting to Java a massively parallel structure formation code from cosmology called Gadget-2. We introduce nested parallelism in the Java version of the simulation code and report good speed-ups. To the best of our knowledge, this is the first time this kind of hybrid parallelism has been demonstrated in a high performance Java application. (C) 2009 Elsevier Inc. All rights reserved.
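The two levels of parallelism described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than Java and does not use MPJ Express or JOMP. The outer level is imitated by worker threads that send partial results through a queue (a stand-in for message passing between processes), and the inner level by a thread pool inside each worker. The names `rank_worker`, `nested_sum`, `N_RANKS` and `THREADS_PER_RANK` are hypothetical.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

N_RANKS = 2            # hypothetical number of outer "processes"
THREADS_PER_RANK = 3   # hypothetical number of inner threads per process


def partial_sum(chunk):
    """Inner-level work item: sum of squares over one chunk."""
    return sum(x * x for x in chunk)


def split(seq, parts):
    """Split seq into `parts` contiguous slices of near-equal size."""
    size, extra = divmod(len(seq), parts)
    out, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        out.append(seq[start:end])
        start = end
    return out


def rank_worker(rank, data, mailbox):
    """Outer level: one simulated process handling its share of the data."""
    # Inner level: threads inside the simulated process.
    with ThreadPoolExecutor(max_workers=THREADS_PER_RANK) as pool:
        local = sum(pool.map(partial_sum, split(data, THREADS_PER_RANK)))
    # "Send" the local result to the collector (stand-in for a message).
    mailbox.put((rank, local))


def nested_sum(data):
    """Sum of squares computed with two nested levels of workers."""
    mailbox = queue.Queue()
    workers = [
        threading.Thread(target=rank_worker, args=(rank, part, mailbox))
        for rank, part in enumerate(split(data, N_RANKS))
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # "Receive" one message per rank and reduce.
    return sum(mailbox.get()[1] for _ in range(N_RANKS))
```

The sketch shows only the structure of the decomposition. Python threads do not speed up CPU-bound work of this kind, so it says nothing about the speed-ups reported for the Java code.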
Abstract:
Improving methodology for Phase I dose-finding studies is currently of great interest in pharmaceutical and medical research. This article discusses the current atmosphere and attitude towards adaptive designs and focuses on the influence of Bayesian approaches.
Abstract:
This paper presents an evaluation of the power consumption of a clocking technique for pipelined designs. The technique shows a dynamic power consumption saving of around 30% over a conventional global clocking mechanism. The results were obtained from a series of experiments on a systolic circuit implemented in Virtex-II devices. The conversion from a globally clocked pipelined design to the proposed technique is straightforward and preserves the original datapath design. The savings can be used immediately either as a power reduction benefit or to increase the frequency of operation of a design for the same power consumption.
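The trade-off in the last sentence can be made concrete with the usual first-order model of dynamic power in CMOS logic. This model is an assumption here and is not taken from the paper. With activity factor \alpha, switched capacitance C, supply voltage V and clock frequency f:

```latex
\begin{align}
P_{\text{dyn}}(f) &= \alpha\, C\, V^{2} f \\
P_{\text{new}}(f) &\approx 0.7\, P_{\text{old}}(f)
  && \text{(reported saving of around 30\%)} \\
P_{\text{new}}(f') = P_{\text{old}}(f)
  &\;\Longrightarrow\; f' \approx \frac{f}{0.7} \approx 1.43\, f
\end{align}
```

The last line holds only if power scales linearly with frequency at a fixed supply voltage and if the circuit's timing permits the higher clock rate. The abstract states neither condition.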
Abstract:
This paper presents a simple clocking technique for migrating classical synchronous pipelined designs to a functionally equivalent synchronous alternative in the context of FPGAs. When the new pipelined design runs at the same throughput as the original design, a mW/MHz ratio around 30% better was observed in Virtex-based FPGA circuits. The evaluation uses a simple but representative and practical systolic design as an example. In essence, the technique is a simple replacement of the clocking mechanism for the pipe-storage elements, and no extra design effort is needed. The results show that the proposed technique allows immediate power and area-time savings in existing designs, rather than seeking potential benefits through a new logic design that uses the classic pipeline clocking mechanism.
Abstract:
Four-dimensional variational data assimilation (4D-Var) is used in environmental prediction to estimate the state of a system from measurements. When 4D-Var is applied in the context of high resolution nested models, problems may arise in the representation of spatial scales longer than the domain of the model. In this paper we study how well 4D-Var is able to estimate the whole range of spatial scales present in one-way nested models. Using a model of the one-dimensional advection–diffusion equation we show that small spatial scales that are observed can be captured by a 4D-Var assimilation, but that information in the larger scales may be degraded. We propose a modification to 4D-Var which allows a better representation of these larger scales.
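As a rough illustration of the setting, the sketch below evaluates a standard strong-constraint 4D-Var cost function for a one-dimensional advection-diffusion model on a periodic grid. It is not the authors' configuration. The numerical scheme (explicit upwind advection with central diffusion), the diagonal error covariances, the observation of every grid point at every step, and all names and parameter values are assumptions made for the example. It contains no nested domain and no minimization.

```python
def step(u, c=0.2, d=0.1):
    """One explicit step of u_t + a*u_x = nu*u_xx on a periodic grid.

    c = a*dt/dx    (Courant number, a > 0, upwind differencing)
    d = nu*dt/dx^2 (diffusion number, central differencing)
    Both values are hypothetical; c + 2*d <= 1 keeps the scheme stable.
    """
    n = len(u)
    return [
        u[i] - c * (u[i] - u[i - 1])
        + d * (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1])
        for i in range(n)
    ]


def trajectory(x0, n_obs):
    """States at observation times 0, 1, ..., n_obs - 1 model steps."""
    states, x = [list(x0)], list(x0)
    for _ in range(n_obs - 1):
        x = step(x)
        states.append(x)
    return states


def cost(x0, xb, obs, sigma_b=1.0, sigma_o=0.5):
    """4D-Var cost J(x0) with B = sigma_b^2 * I and R = sigma_o^2 * I.

    J = 0.5 * |x0 - xb|^2 / sigma_b^2
      + 0.5 * sum_k |M_k(x0) - y_k|^2 / sigma_o^2
    where M_k is k applications of `step` and y_k = obs[k].
    """
    jb = sum((a - b) ** 2 for a, b in zip(x0, xb)) / sigma_b ** 2
    jo = 0.0
    for x, y in zip(trajectory(x0, len(obs)), obs):
        jo += sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / sigma_o ** 2
    return 0.5 * (jb + jo)
```

For example, with `truth = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]` and `obs = trajectory(truth, 3)`, the call `cost(truth, truth, obs)` is zero and any perturbed initial state gives a positive cost. An assimilation would minimize `cost` over `x0`, typically with a gradient supplied by an adjoint model.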
Abstract:
In recent years, there has been a drive to save development costs and shorten the time-to-market of new therapies. Research into novel trial designs to facilitate this goal has led to, amongst other approaches, the development of methodology for seamless phase II/III designs. Such designs allow treatment or dose selection at an interim analysis and comparative evaluation of efficacy with control in the same study. These methods have gained much attention because of their potential advantages compared to conventional drug development programmes with separate trials for individual phases. In this article, we review the various approaches to seamless phase II/III designs based upon the group-sequential approach, the combination test approach and the adaptive Dunnett method. The objective of this article is to describe the approaches in a unified framework and to highlight their similarities and differences, so that a trialist considering such a trial can choose an appropriate methodology.
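One widely used combination test is the inverse normal method, which merges the one-sided p-values from the two stages using weights fixed in advance. The sketch below shows only that combination step. It assumes equal weights by default, and the function name is hypothetical. The treatment selection and multiplicity adjustment that the reviewed designs also require are not covered.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def inverse_normal_combination(p1, p2, w1=sqrt(0.5)):
    """Combine two stage-wise one-sided p-values into one p-value.

    The weights satisfy w1^2 + w2^2 = 1 and must be fixed before the
    interim analysis; w1 = sqrt(0.5) gives both stages equal weight.
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if not 0.0 < w1 < 1.0:
        raise ValueError("w1 must lie strictly between 0 and 1")
    w2 = sqrt(1.0 - w1 ** 2)
    z = (w1 * _STD_NORMAL.inv_cdf(1.0 - p1)
         + w2 * _STD_NORMAL.inv_cdf(1.0 - p2))
    return 1.0 - _STD_NORMAL.cdf(z)
```

Two stages that each give p = 0.5 combine to 0.5, while two stages that each give p = 0.025 combine to a value well below 0.025, since consistent evidence from both stages reinforces itself.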
Abstract:
The temporal variability of the atmosphere through which radio waves pass in the technique of differential radar interferometry can seriously limit the accuracy with which the method can measure surface motion. A forward, nested mesoscale model of the atmosphere can be used to simulate the variable water content along the radar path and the resultant phase delays. Using this approach we demonstrate how to correct an interferogram of Mount Etna in Sicily associated with an eruption in 2004-5. The regional mesoscale model (Unified Model) used to simulate the atmosphere at higher resolutions consists of four nested domains increasing in resolution (12, 4, 1, 0.3 km), sitting within the analysis version of a global numerical model that is used to initiate the simulation. Using the high resolution 3D model output we compute the surface pressure, temperature and the water vapour, liquid and solid water contents, enabling the dominant hydrostatic and wet delays to be calculated at specific times corresponding to the acquisition of the radar data. We can also simulate the second-order delay effects due to liquid water and ice.
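The step from model fields to a path delay can be sketched with the Smith and Weintraub two-term refractivity formula, integrated over height. This is an assumed simplification and not the paper's processing chain: it treats a vertical (zenith) path only, ignores the liquid water and ice contributions mentioned in the abstract, and uses hypothetical function names and a hypothetical input layout.

```python
from math import pi


def refractivity(p_hpa, t_k, e_hpa):
    """Radio refractivity N (Smith and Weintraub two-term form).

    p_hpa: total pressure in hPa
    t_k:   temperature in K
    e_hpa: water vapour partial pressure in hPa
    """
    return 77.6 * p_hpa / t_k + 3.73e5 * e_hpa / t_k ** 2


def zenith_delay(levels):
    """One-way zenith delay in metres: 1e-6 * integral of N over height.

    levels: list of (height_m, pressure_hPa, temperature_K, vapour_hPa)
            tuples sorted by increasing height (trapezoidal rule).
    """
    total = 0.0
    for (z0, *state0), (z1, *state1) in zip(levels, levels[1:]):
        mean_n = 0.5 * (refractivity(*state0) + refractivity(*state1))
        total += mean_n * (z1 - z0)
    return 1e-6 * total


def two_way_phase(delay_m, wavelength_m):
    """Interferometric phase in radians for a two-way path delay."""
    return 4.0 * pi * delay_m / wavelength_m
```

Differencing the delays computed for the two acquisition times and converting the result with `two_way_phase` gives an estimate of the atmospheric phase screen to subtract from the interferogram. The radar wavelength is left as a parameter because the abstract does not name the sensor.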
Abstract:
Hybrid multiprocessor architectures, which combine reconfigurable computing and multiprocessors on a chip, are being proposed to transcend the performance of standard multi-core parallel systems. Both fine-grained and coarse-grained parallel algorithm implementations are feasible in such hybrid frameworks. This paper presents a compositional strategy for designing fine-grained multi-phase regular processor arrays that target hybrid architectures. The method is based on deriving component designs using classical regular array techniques and composing the components into a unified global design. The resulting designs are effective and are characterized by phase changes and data routing at run-time. To describe the data transfer between phases, the concept of a communication domain is introduced so that the producer–consumer relationship arising from multi-phase computation can be treated in a unified way as a data routing phase. This technique is applied to derive new designs of multi-phase regular arrays with different dataflow between phases of computation.