975 results for design space exploration


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we explore an implementation of a high-throughput streaming application on REDEFINE-v2, an enhancement of REDEFINE. REDEFINE is a polymorphic ASIC that combines the flexibility of a programmable solution with the execution speed of an ASIC. In REDEFINE, Compute Elements are arranged in an 8x8 grid connected via a Network on Chip (NoC) called RECONNECT to realize the various macrofunctional blocks of an equivalent ASIC. For a 1024-point FFT, we carry out an application-architecture design space exploration by examining various characterizations of the Compute Elements in terms of the size of the instruction store. We further study the impact of using application-specific, vectorized FUs. By setting up different partitions of the FFT algorithm for persistent execution on REDEFINE-v2, we derive the benefits of pipelined execution for higher performance. The impact of the REDEFINE-v2 micro-architecture on an arbitrary N-point FFT (N > 4096) is also analyzed. We report the various algorithm-architecture tradeoffs in terms of area and execution speed relative to an ASIC implementation. In addition, we compare the performance gain with respect to a GPP.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In high performance computing, considerable effort has gone into accelerating Numerical Linear Algebra (NLA) kernels such as QR Decomposition (QRD) while retaining reconfigurability and scalability. Popular custom hardware solutions in the form of systolic arrays can deliver high performance, but they are not scalable and hence not commercially viable. In this paper, we show how systolic solutions of QRD can be realized efficiently on REDEFINE, a scalable runtime reconfigurable hardware platform. We propose various enhancements to REDEFINE to meet the specific needs of accelerating NLA kernels. We further carry out a design space exploration of the proposed solution for an arbitrary problem of size n × n. We determine the right size of the sub-array in accordance with the optimal pipeline depth of the core execution units and the number of such units to be used per sub-array.
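For orientation, systolic QRD arrays are commonly built from Givens rotations, each of which zeroes one sub-diagonal element by rotating two adjacent rows. The abstract does not say which scheme is mapped onto REDEFINE, so the following is only a sequential sketch of the rotation-based technique, not the authors' implementation; the function name `givens_qr` is an assumption made for illustration.

```python
import math

def givens_qr(a):
    """QR-decompose a square matrix (list of rows) using Givens rotations.

    Illustrative only: a sequential version of the row rotations that a
    systolic array would perform in parallel.
    """
    n = len(a)
    r = [[float(v) for v in row] for row in a]
    qt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        # Work upwards from the bottom row, zeroing r[row][col] each time.
        for row in range(n - 1, col, -1):
            x, y = r[row - 1][col], r[row][col]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # nothing to eliminate
            c, s = x / h, y / h
            for k in range(n):
                # Rotate rows (row-1, row) of R.
                u, v = r[row - 1][k], r[row][k]
                r[row - 1][k], r[row][k] = c * u + s * v, -s * u + c * v
                # Apply the same rotation to the accumulated Q^T.
                u, v = qt[row - 1][k], qt[row][k]
                qt[row - 1][k], qt[row][k] = c * u + s * v, -s * u + c * v
    # qt holds Q^T; transpose it so that a == q * r (up to rounding).
    q = [[qt[j][i] for j in range(n)] for i in range(n)]
    return q, r
```

Calling `givens_qr` on an n × n matrix returns an orthogonal `q` and an upper-triangular `r` whose product reproduces the input up to floating-point rounding.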

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Efficiently exploring exponential-size architectural design spaces with many interacting parameters remains an open problem: the sheer number of experiments required renders detailed simulation intractable. We attack this via an automated approach that builds accurate predictive models. We simulate sampled points and use the results to teach our models the function describing relationships among design parameters. The models can be queried and are very fast, enabling efficient discovery of design tradeoffs. We validate our approach via two uniprocessor sensitivity studies, predicting IPC with only 1–2% error. In an experimental study using the approach, training on 1% of a 250-K-point CMP design space allows our models to predict performance with only 4–5% error. Our predictive modeling combines well with techniques that reduce the time taken by each simulation experiment, achieving net time savings of three to four orders of magnitude.
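The sample-then-predict workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the design space `SPACE`, the formula inside `simulate_ipc`, and the nearest-neighbour predictor are all hypothetical stand-ins chosen for brevity. The abstract does not specify the model family here, and nearest-neighbour averaging is not claimed to be the authors' method.

```python
import itertools
import random

# Hypothetical design space; parameter names and values are illustrative only.
SPACE = {
    "l2_kb": [256, 512, 1024, 2048],
    "issue_width": [1, 2, 4, 8],
    "rob_entries": [32, 64, 128, 256],
}

def simulate_ipc(point):
    """Stand-in for a slow detailed simulation (made-up analytic formula)."""
    l2_kb, width, rob = point
    return 0.3 * width ** 0.5 + 0.05 * (rob / 32) ** 0.5 + 0.1 * (l2_kb / 256) ** 0.25

def encode(point):
    """Map each parameter value to its ordinal position, scaled to [0, 1]."""
    return [vals.index(v) / (len(vals) - 1) for v, vals in zip(point, SPACE.values())]

def predict(train, query, k=3):
    """Average the simulated results of the k sampled points nearest to query."""
    q = encode(query)

    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(encode(item[0]), q))

    nearest = sorted(train, key=dist)[:k]
    return sum(ipc for _, ipc in nearest) / len(nearest)

def run(sample_fraction=0.25, seed=0):
    """Simulate a random sample, then score predictions over the whole space."""
    points = list(itertools.product(*SPACE.values()))
    rng = random.Random(seed)
    sampled = rng.sample(points, max(1, int(len(points) * sample_fraction)))
    train = [(p, simulate_ipc(p)) for p in sampled]
    errors = [abs(predict(train, p) - simulate_ipc(p)) / simulate_ipc(p) for p in points]
    return train, sum(errors) / len(errors)
```

`run()` returns the simulated training set and the mean relative prediction error over the full space. In a real study the expensive call would be the simulator, and only the sampled points would ever be simulated.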

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design space exploration for optimization, followed by design verification. Designers of special purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The exploitation of diverse target architectures is typically associated with developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL in order to introduce FPGAs as a potential platform to efficiently execute simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium (e.g., 8,000 bit) to long length (e.g., 64,800 bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated, on the dimension and phase of the design, the GPU or FPGA may suit different purposes more conveniently, thus providing different acceleration factors over conventional multicore CPUs.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Caches are known to consume up to half of all system power in embedded processors. Co-optimizing the performance and power of the cache subsystem is therefore an important step in the design of embedded systems, especially those employing application-specific instruction processors. In this project, we propose an analytical cache model that succinctly captures the miss performance of an application over the entire cache parameter space. Unlike exhaustive trace-driven simulation, our model requires the program to be simulated only once so that a few key characteristics can be obtained. Using these application-dependent characteristics, the model can span the entire cache parameter space consisting of cache sizes, associativities and cache block sizes. Our unified model caters for direct-mapped, set-associative and fully associative instruction, data and unified caches. Validation against full trace-driven simulations shows that our model has a high degree of fidelity. Finally, we show how the model can be coupled with a power model for caches so that one can very quickly decide on Pareto-optimal performance-power design points for rapid design space exploration.
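Selecting Pareto-optimal performance-power points, as mentioned above, amounts to discarding every configuration that another configuration beats or matches on both metrics. The sketch below shows that filter only; the candidate names and their miss-rate and power figures are invented for illustration and do not come from the model described in the abstract.

```python
def pareto_front(points):
    """Keep configurations not dominated in (miss_rate, power).

    Lower is better for both metrics. A point is dominated if some other
    point is no worse on both metrics and strictly better on at least one.
    """
    front = []
    for name, miss, power in points:
        dominated = any(
            m <= miss and p <= power and (m < miss or p < power)
            for _, m, p in points
        )
        if not dominated:
            front.append((name, miss, power))
    return front

# Hypothetical (configuration, miss rate, relative power) estimates.
candidates = [
    ("4KB/direct", 0.12, 1.0),
    ("8KB/2-way", 0.08, 1.6),
    ("8KB/direct", 0.10, 1.8),   # dominated by 8KB/2-way
    ("16KB/4-way", 0.05, 2.9),
    ("32KB/4-way", 0.05, 4.1),   # dominated by 16KB/4-way
]

front = pareto_front(candidates)
```

With these made-up figures, `front` keeps the 4KB direct-mapped, 8KB 2-way and 16KB 4-way configurations. The quadratic scan is adequate for small candidate sets; larger spaces would usually sort by one metric first.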

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Exploration with a generative formalism must necessarily account for the nature of interaction between humans and the design space explorer. Established accounts of design interaction are complicated by two propositions in Woodbury and Burrow's keynote on design space exploration. The first is the emphasis on the primacy of the design space as an ordered collection of partial designs (versions, alternatives, extensions); few studies exist in the design interaction literature on working with multiple threads simultaneously. The second is the need to situate, aid, and amplify human design intentions using computational tools. Although specific research and practice tools for amplification (sketching, generation, variation) have had success, there is a lack of generic, flexible, interoperable, and extensible representations to support amplification. This paper addresses both issues, working with design threads and computer-assisted design amplification, through a theoretical model of dialogue based on Grice's model of rational conversation. Using the concept of mixed initiative, the paper presents a visual notation for representing dialogue between designer and design space formalism through abstract examples of exploration tasks and dialogue integration.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Blank pages removed

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Fluid bed granulation is a key pharmaceutical process which improves many of the powder properties for tablet compression. The fluid bed granulation process comprises dry mixing, wetting and drying phases. Granules of high quality can be obtained by understanding and controlling the critical process parameters through timely measurements. Process analytical technologies (PAT) include physical process measurements and particle size data of a fluid bed granulator that are analysed in an integrated manner. Recent regulatory guidelines strongly encourage the pharmaceutical industry to apply scientific and risk management approaches to the development of a product and its manufacturing process. The aim of this study was to utilise PAT tools to increase the process understanding of fluid bed granulation and drying. Inlet air humidity levels and granulation liquid feed affect powder moisture during fluid bed granulation. Moisture influences many process, granule and tablet qualities. The approach in this thesis was to identify sources of variation that are mainly related to moisture. The aim was to determine correlations and relationships, and to utilise the PAT and design space concepts for fluid bed granulation and drying. Monitoring the material behaviour in a fluidised bed has traditionally relied on the observational ability and experience of an operator. There has been a lack of good criteria for characterising material behaviour during the spraying and drying phases, even though the entire performance of the process and the end product quality depend on it. The granules were produced in an instrumented bench-scale Glatt WSG5 fluid bed granulator. The effect of inlet air humidity and granulation liquid feed on the temperature measurements at different locations of a fluid bed granulator system was determined. This revealed dynamic changes in the measurements and enabled the optimal sites for process control to be found.
The moisture originating from the granulation liquid and inlet air affected the temperature of the mass and the pressure difference over the granules. Moreover, the effects of inlet air humidity and granulation liquid feed rate on granule size were evaluated, and compensatory techniques were used to optimize particle size. Various techniques for indicating the drying end-point were compared. The ∆T method, which is based on thermodynamic principles, eliminated the effects of humidity variations and resulted in the most precise estimation of the drying end-point. The influence of fluidisation behaviour on drying end-point detection was determined. The feasibility of the ∆T method, and thus the similarity of end-point moisture contents, was found to depend on the variation in fluidisation between manufacturing batches. A novel parameter that describes the behaviour of material in a fluid bed was developed. The flow rate of the process air and the turbine fan speed were used to calculate this parameter, and it was compared to the fluidisation behaviour and the particle size results. The design space process trajectories for smooth fluidisation based on the fluidisation parameters were determined. With this design space it is possible to avoid excessive fluidisation, improper fluidisation and bed collapse. Furthermore, various process phenomena and failure modes were observed with the in-line particle size analyser. Both a rapid increase and a decrease in granule size could be monitored in a timely manner. The fluidisation parameter and the pressure difference over the filters were also found to reflect particle size once the granules had been formed. The various physical parameters evaluated in this thesis give valuable information on fluid bed process performance and increase process understanding.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A study of the history and philosophy of India's contribution to the exploration of space since antiquity provides interesting insights. The contributions are described over three periods, namely: (1) the ten millennia from 10,000 BC, with a twilight period up to 900 AD; (2) the ten centuries from 900 AD to 1900 AD; and (3) the ten decades from 1900 AD to 2000 AD; called mythological, medieval, and modern respectively. Some important events during these periods provide a reference view of the progress. The Vedas during the mythological period and the Siddhantas during the medieval period, which are based on astronomical observations, indicate that the Indian contribution preceded other cultures. But most Western historians ignore this fact time and again in spite of many proofs provided to the contrary. This chapter also shows that Indians had the proper scientific attitude of developing any physical theory through the triplet of mind, model, and measurements. It is this same triplet that forms the basis of the present-day, well-known Kalman filter technique. Up to about 1500 BC the Indian contribution was leading, but during foreign invasion and occupation it lagged, and it has been improving only since independence.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Past studies use deterministic models to evaluate the optimal cache configuration or to explore its design space. However, with the increasing number of components present on a chip multiprocessor (CMP), deterministic approaches do not scale well. Hence, we apply probabilistic genetic algorithms (GA) to determine a near-optimal cache configuration for a sixteen-tile CMP. We propose and implement a faster trace-based approach to estimate the fitness of a chromosome. It shows up to 218x simulation speedup over cycle-accurate architectural simulation. Our methodology can be applied to other cache optimization problems, such as design space exploration of a cache and its partitioning among applications or virtual machines.
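As a rough illustration of how a genetic algorithm searches a cache configuration space, the sketch below encodes a configuration as a chromosome of indices and evolves a population with tournament selection, one-point crossover, mutation and elitism. Everything specific here is an assumption: the parameter lists in `GENES`, the made-up `fitness` formula (a stand-in for the trace-based estimate mentioned above), and the GA settings are not taken from the paper.

```python
import random

# Hypothetical parameter choices; the abstract does not list the real ones.
GENES = [
    [16, 32, 64, 128, 256],  # cache size in KB
    [1, 2, 4, 8],            # associativity
    [32, 64, 128],           # block size in bytes
]

def fitness(chrom):
    """Made-up score standing in for a trace-based estimate; higher is better."""
    size, assoc, block = (GENES[i][g] for i, g in enumerate(chrom))
    miss_penalty = 100.0 / (size ** 0.5 * assoc ** 0.3) + abs(block - 64) / 16
    area_penalty = 0.05 * size + 0.5 * assoc
    return -(miss_penalty + area_penalty)

def evolve(pop_size=20, generations=30, mutation_rate=0.1, seed=1):
    """Return the best decoded configuration and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [[rng.randrange(len(g)) for g in GENES] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        nxt = [pop[0][:]]  # elitism: carry the best chromosome over unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, len(GENES))        # one-point crossover
            child = a[:cut] + b[cut:]
            for i, g in enumerate(GENES):             # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(len(g))
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    return [GENES[i][g] for i, g in enumerate(best)], history
```

`evolve()` returns a `[size_kb, associativity, block_bytes]` list and the per-generation best fitness, which never decreases because of elitism. The toy space has only 60 points, so exhaustive search would be trivial here; a GA pays off when each fitness evaluation is expensive and the space is far larger.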

Relevance:

100.00% 100.00%

Publisher:

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Cyber-physical systems integrate computation, networking, and physical processes. Substantial research challenges exist in the design and verification of such large-scale, distributed sensing, actuation, and control systems. Rapidly improving technology and recent advances in control theory, networked systems, and computer science give us the opportunity to drastically improve our approach to integrated flow of information and cooperative behavior. Current systems rely on text-based specifications and manual design. Using new technology advances, we can create easier, more efficient, and cheaper ways of developing these control systems. This thesis will focus on design considerations for system topologies, ways to formally and automatically specify requirements, and methods to synthesize reactive control protocols, all within the context of an aircraft electric power system as a representative application area.

This thesis consists of three complementary parts: synthesis, specification, and design. The first section focuses on the synthesis of central and distributed reactive controllers for an aircraft electric power system. This approach incorporates methodologies from computer science and control. The resulting controllers are correct by construction with respect to system requirements, which are formulated using the specification language of linear temporal logic (LTL). The second section addresses how to formally specify requirements and introduces a domain-specific language for electric power systems. A software tool automatically converts high-level requirements into LTL and synthesizes a controller.

The final sections focus on design space exploration. A design methodology is proposed that uses mixed-integer linear programming to obtain candidate topologies, which are then used to synthesize controllers. The discrete-time control logic is then verified in real time by two methods: hardware and simulation. Finally, the problem of partial observability and dynamic state estimation is explored. Given a set placement of sensors on an electric power system, measurements from these sensors can be used in conjunction with control logic to infer the state of the system.