4 resultados para Third Space
em Massachusetts Institute of Technology
Resumo:
General-purpose computing devices allow us to (1) customize computation after fabrication and (2) conserve area by reusing expensive active circuitry for different functions in time. We define RP-space, a restricted domain of the general-purpose architectural space focussed on reconfigurable computing architectures. Two dominant features differentiate reconfigurable from special-purpose architectures and account for most of the area overhead associated with RP devices: (1) instructions which tell the device how to behave, and (2) flexible interconnect which supports task dependent dataflow between operations. We can characterize RP-space by the allocation and structure of these resources and compare the efficiencies of architectural points across broad application characteristics. Conventional FPGAs fall at one extreme end of this space and their efficiency ranges over two orders of magnitude across the space of application characteristics. Understanding RP-space and its consequences allows us to pick the best architecture for a task and to search for more robust design points in the space. Our DPGA, a fine- grained computing device which adds small, on-chip instruction memories to FPGAs is one such design point. For typical logic applications and finite- state machines, a DPGA can implement tasks in one-third the area of a traditional FPGA. TSFPGA, a variant of the DPGA which focuses on heavily time-switched interconnect, achieves circuit densities close to the DPGA, while reducing typical physical mapping times from hours to seconds. Rigid, fabrication-time organization of instruction resources significantly narrows the range of efficiency for conventional architectures. To avoid this performance brittleness, we developed MATRIX, the first architecture to defer the binding of instruction resources until run-time, allowing the application to organize resources according to its needs. Our focus MATRIX design point is based on an array of 8-bit ALU and register-file building blocks interconnected via a byte-wide network. With today's silicon, a single chip MATRIX array can deliver over 10 Gop/s (8-bit ops). On sample image processing tasks, we show that MATRIX yields 10-20x the computational density of conventional processors. Understanding the cost structure of RP-space helps us identify these intermediate architectural points and may provide useful insight more broadly in guiding our continual search for robust and efficient general-purpose computing structures.
Resumo:
The Space Systems, Policy and Architecture Research Consortium (SSPARC) was formed to make substantial progress on problems of national importance. The goals of SSPARC were to: • Provide technologies and methods that will allow the creation of flexible, upgradable space systems, • Create a “clean sheet” approach to space systems architecture determination and design, including the incorporation of risk, uncertainty, and flexibility issues, and • Consider the impact of national space policy on the above. This report covers the last two goals, and demonstrates that the effort was largely successful.
Resumo:
Caches are known to consume up to half of all system power in embedded processors. Co-optimizing performance and power of the cache subsystems is therefore an important step in the design of embedded systems, especially those employing application specific instruction processors. In this project, we propose an analytical cache model that succinctly captures the miss performance of an application over the entire cache parameter space. Unlike exhaustive trace driven simulation, our model requires that the program be simulated once so that a few key characteristics can be obtained. Using these application-dependent characteristics, the model can span the entire cache parameter space consisting of cache sizes, associativity and cache block sizes. In our unified model, we are able to cater for direct-mapped, set and fully associative instruction, data and unified caches. Validation against full trace-driven simulations shows that our model has a high degree of fidelity. Finally, we show how the model can be coupled with a power model for caches such that one can very quickly decide on pareto-optimal performance-power design points for rapid design space exploration.
Resumo:
We present a technique for the rapid and reliable evaluation of linear-functional output of elliptic partial differential equations with affine parameter dependence. The essential components are (i) rapidly uniformly convergent reduced-basis approximations — Galerkin projection onto a space WN spanned by solutions of the governing partial differential equation at N (optimally) selected points in parameter space; (ii) a posteriori error estimation — relaxations of the residual equation that provide inexpensive yet sharp and rigorous bounds for the error in the outputs; and (iii) offline/online computational procedures — stratagems that exploit affine parameter dependence to de-couple the generation and projection stages of the approximation process. The operation count for the online stage — in which, given a new parameter value, we calculate the output and associated error bound — depends only on N (typically small) and the parametric complexity of the problem. The method is thus ideally suited to the many-query and real-time contexts. In this paper, based on the technique we develop a robust inverse computational method for very fast solution of inverse problems characterized by parametrized partial differential equations. The essential ideas are in three-fold: first, we apply the technique to the forward problem for the rapid certified evaluation of PDE input-output relations and associated rigorous error bounds; second, we incorporate the reduced-basis approximation and error bounds into the inverse problem formulation; and third, rather than regularize the goodness-of-fit objective, we may instead identify all (or almost all, in the probabilistic sense) system configurations consistent with the available experimental data — well-posedness is reflected in a bounded "possibility region" that furthermore shrinks as the experimental error is decreased.