10 resultados para Compute unified device architectures
em Massachusetts Institute of Technology
Resumo:
General-purpose computing devices allow us to (1) customize computation after fabrication and (2) conserve area by reusing expensive active circuitry for different functions in time. We define RP-space, a restricted domain of the general-purpose architectural space focussed on reconfigurable computing architectures. Two dominant features differentiate reconfigurable from special-purpose architectures and account for most of the area overhead associated with RP devices: (1) instructions which tell the device how to behave, and (2) flexible interconnect which supports task dependent dataflow between operations. We can characterize RP-space by the allocation and structure of these resources and compare the efficiencies of architectural points across broad application characteristics. Conventional FPGAs fall at one extreme end of this space and their efficiency ranges over two orders of magnitude across the space of application characteristics. Understanding RP-space and its consequences allows us to pick the best architecture for a task and to search for more robust design points in the space. Our DPGA, a fine- grained computing device which adds small, on-chip instruction memories to FPGAs is one such design point. For typical logic applications and finite- state machines, a DPGA can implement tasks in one-third the area of a traditional FPGA. TSFPGA, a variant of the DPGA which focuses on heavily time-switched interconnect, achieves circuit densities close to the DPGA, while reducing typical physical mapping times from hours to seconds. Rigid, fabrication-time organization of instruction resources significantly narrows the range of efficiency for conventional architectures. To avoid this performance brittleness, we developed MATRIX, the first architecture to defer the binding of instruction resources until run-time, allowing the application to organize resources according to its needs. Our focus MATRIX design point is based on an array of 8-bit ALU and register-file building blocks interconnected via a byte-wide network. With today's silicon, a single chip MATRIX array can deliver over 10 Gop/s (8-bit ops). On sample image processing tasks, we show that MATRIX yields 10-20x the computational density of conventional processors. Understanding the cost structure of RP-space helps us identify these intermediate architectural points and may provide useful insight more broadly in guiding our continual search for robust and efficient general-purpose computing structures.
Resumo:
Performance and manufacturability are two important issues that must be taken into account during MEMS design. Existing MEMS design models or systems follow a process-driven design paradigm, that is, design starts from the specification of process sequence or the customization of foundry-ready process template. There has been essentially no methodology or model that supports generic, high-level design synthesis for MEMS conceptual design. As a result, there lacks a basis for specifying the initial process sequences. To address this problem, this paper proposes a performance-driven, microfabrication-oriented methodology for MEMS conceptual design. A unified behaviour representation method is proposed which incorporates information of both physical interactions and chemical/biological/other reactions. Based on this method, a behavioural process based design synthesis model is proposed, which exploits multidisciplinary phenomena for design solutions, including both the structural components and their configuration for the MEMS device, as well as the necessary substances for the chemical/biological/other reactions. The model supports both forward and backward synthetic search for suitable phenomena. To ensure manufacturability, a strategy of using microfabrication-oriented phenomena as design knowledge is proposed, where the phenomena are developed from existing MEMS devices that have associated MEMS-specific microfabrication processes or foundry-ready process templates. To test the applicability of the proposed methodology, the paper also studies microfluidic device design and uses a micro-pump design for the case study.
Resumo:
The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs (1993) recently proposed an EM algorithm for the mixture of experts architecture of Jacobs, Jordan, Nowlan and Hinton (1991) and the hierarchical mixture of experts architecture of Jordan and Jacobs (1992). They showed empirically that the EM algorithm for these architectures yields significantly faster convergence than gradient ascent. In the current paper we provide a theoretical analysis of this algorithm. We show that the algorithm can be regarded as a variable metric algorithm with its searching direction having a positive projection on the gradient of the log likelihood. We also analyze the convergence of the algorithm and provide an explicit expression for the convergence rate. In addition, we describe an acceleration technique that yields a significant speedup in simulation experiments.
Resumo:
We formulate and interpret several multi-modal registration methods in the context of a unified statistical and information theoretic framework. A unified interpretation clarifies the implicit assumptions of each method yielding a better understanding of their relative strengths and weaknesses. Additionally, we discuss a generative statistical model from which we derive a novel analysis tool, the "auto-information function", as a means of assessing and exploiting the common spatial dependencies inherent in multi-modal imagery. We analytically derive useful properties of the "auto-information" as well as verify them empirically on multi-modal imagery. Among the useful aspects of the "auto-information function" is that it can be computed from imaging modalities independently and it allows one to decompose the search space of registration problems.
Resumo:
I describe an approach to forming hypotheses about hidden mechanism configurations within devices given external observations and a vocabulary of primitive mechanisms. An implemented causal modelling system called JACK constructs explanations for why a second piece of toast comes out lighter, why the slide in a tire gauge does not slip back inside when the gauge is removed from the tire, and how in a refrigerator a single substance can serve as a heat sink for the interior and a heat source for the exterior. I report the number of hypotheses admitted for each device example, and provide empirical results which isolate the pruning power due to different constraint sources.
Resumo:
The constraint paradigm is a model of computation in which values are deduced whenever possible, under the limitation that deductions be local in a certain sense. One may visualize a constraint 'program' as a network of devices connected by wires. Data values may flow along the wires, and computation is performed by the devices. A device computes using only locally available information (with a few exceptions), and places newly derived values on other, locally attached wires. In this way computed values are propagated. An advantage of the constraint paradigm (not unique to it) is that a single relationship can be used in more than one direction. The connections to a device are not labelled as inputs and outputs; a device will compute with whatever values are available, and produce as many new values as it can. General theorem provers are capable of such behavior, but tend to suffer from combinatorial explosion; it is not usually useful to derive all the possible consequences of a set of hypotheses. The constraint paradigm places a certain kind of limitation on the deduction process. The limitations imposed by the constraint paradigm are not the only one possible. It is argued, however, that they are restrictive enough to forestall combinatorial explosion in many interesting computational situations, yet permissive enough to allow useful computations in practical situations. Moreover, the paradigm is intuitive: It is easy to visualize the computational effects of these particular limitations, and the paradigm is a natural way of expressing programs for certain applications, in particular relationships arising in computer-aided design. A number of implementations of constraint-based programming languages are presented. A progression of ever more powerful languages is described, complete implementations are presented and design difficulties and alternatives are discussed. The goal approached, though not quite reached, is a complete programming system which will implicitly support the constraint paradigm to the same extent that LISP, say, supports automatic storage management.
Resumo:
For many types of learners one can compute the statistically 'optimal' way to select data. We review how these techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
Resumo:
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples -- in particular the regression problem of approximating a multivariate function from sparse data. We present both formulations in a unified framework, namely in the context of Vapnik's theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics.
Resumo:
Traditionally, we've focussed on the question of how to make a system easy to code the first time, or perhaps on how to ease the system's continued evolution. But if we look at life cycle costs, then we must conclude that the important question is how to make a system easy to operate. To do this we need to make it easy for the operators to see what's going on and to then manipulate the system so that it does what it is supposed to. This is a radically different criterion for success. What makes a computer system visible and controllable? This is a difficult question, but it's clear that today's modern operating systems with nearly 50 million source lines of code are neither. Strikingly, the MIT Lisp Machine and its commercial successors provided almost the same functionality as today's mainstream sytsems, but with only 1 Million lines of code. This paper is a retrospective examination of the features of the Lisp Machine hardware and software system. Our key claim is that by building the Object Abstraction into the lowest tiers of the system, great synergy and clarity were obtained. It is our hope that this is a lesson that can impact tomorrow's designs. We also speculate on how the spirit of the Lisp Machine could be extended to include a comprehensive access control model and how new layers of abstraction could further enrich this model.
Resumo:
This paper proposes three tests to determine whether a given nonlinear device noise model is in agreement with accepted thermodynamic principles. These tests are applied to several models. One conclusion is that every Gaussian noise model for any nonlinear device predicts thermodynamically impossible circuit behavior: these models should be abandoned. But the nonlinear shot-noise model predicts thermodynamically acceptable behavior under a constraint derived here. Further, this constraint specifies the current noise amplitude at each operating point from knowledge of the device v - i curve alone. For the Gaussian and shot-noise models, this paper shows how the thermodynamic requirements can be reduced to concise mathematical tests involving no approximatio