15 resultados para VLSI CAD
em Massachusetts Institute of Technology
Resumo:
For applications involving the control of moving vehicles, the recovery of relative motion between a camera and its environment is of high utility. This thesis describes the design and testing of a real-time analog VLSI chip which estimates the focus of expansion (FOE) from measured time-varying images. Our approach assumes a camera moving through a fixed world with translational velocity; the FOE is the projection of the translation vector onto the image plane. This location is the point towards which the camera is moving, and other points appear to be expanding outward from. By way of the camera imaging parameters, the location of the FOE gives the direction of 3-D translation. The algorithm we use for estimating the FOE minimizes the sum of squares of the differences at every pixel between the observed time variation of brightness and the predicted variation given the assumed position of the FOE. This minimization is not straightforward, because the relationship between the brightness derivatives depends on the unknown distance to the surface being imaged. However, image points where brightness is instantaneously constant play a critical role. Ideally, the FOE would be at the intersection of the tangents to the iso-brightness contours at these "stationary" points. In practice, brightness derivatives are hard to estimate accurately given that the image is quite noisy. Reliable results can nevertheless be obtained if the image contains many stationary points and the point is found that minimizes the sum of squares of the perpendicular distances from the tangents at the stationary points. The FOE chip calculates the gradient of this least-squares minimization sum, and the estimation is performed by closing a feedback loop around it. The chip has been implemented using an embedded CCD imager for image acquisition and a row-parallel processing scheme. A 64 x 64 version was fabricated in a 2um CCD/ BiCMOS process through MOSIS with a design goal of 200 mW of on-chip power, a top frame rate of 1000 frames/second, and a basic accuracy of 5%. A complete experimental system which estimates the FOE in real time using real motion and image scenes is demonstrated.
Resumo:
This thesis examines a complete design framework for a real-time, autonomous system with specialized VLSI hardware for computing 3-D camera motion. In the proposed architecture, the first step is to determine point correspondences between two images. Two processors, a CCD array edge detector and a mixed analog/digital binary block correlator, are proposed for this task. The report is divided into three parts. Part I covers the algorithmic analysis; part II describes the design and test of a 32$\time $32 CCD edge detector fabricated through MOSIS; and part III compares the design of the mixed analog/digital correlator to a fully digital implementation.
Resumo:
The Message-Driven Processor is a node of a large-scale multiprocessor being developed by the Concurrent VLSI Architecture Group. It is intended to support fine-grained, message passing, parallel computation. It contains several novel architectural features, such as a low-latency network interface, extensive type-checking hardware, and on-chip memory that can be used as an associative lookup table. This document is a programmer's guide to the MDP. It describes the processor's register architecture, instruction set, and the data types supported by the processor. It also details the MDP's message sending and exception handling facilities.
Resumo:
The amount of computation required to solve many early vision problems is prodigious, and so it has long been thought that systems that operate in a reasonable amount of time will only become feasible when parallel systems become available. Such systems now exist in digital form, but most are large and expensive. These machines constitute an invaluable test-bed for the development of new algorithms, but they can probably not be scaled down rapidly in both physical size and cost, despite continued advances in semiconductor technology and machine architecture. Simple analog networks can perform interesting computations, as has been known for a long time. We have reached the point where it is feasible to experiment with implementation of these ideas in VLSI form, particularly if we focus on networks composed of locally interconnected passive elements, linear amplifiers, and simple nonlinear components. While there have been excursions into the development of ideas in this area since the very beginnings of work on machine vision, much work remains to be done. Progress will depend on careful attention to matching of the capabilities of simple networks to the needs of early vision. Note that this is not at all intended to be anything like a review of the field, but merely a collection of some ideas that seem to be interesting.
Resumo:
Early and intermediate vision algorithms, such as smoothing and discontinuity detection, are often implemented on general-purpose serial, and more recently, parallel computers. Special-purpose hardware implementations of low-level vision algorithms may be needed to achieve real-time processing. This memo reviews and analyzes some hardware implementations of low-level vision algorithms. Two types of hardware implementations are considered: the digital signal processing chips of Ruetz (and Broderson) and the analog VLSI circuits of Carver Mead. The advantages and disadvantages of these two approaches for producing a general, real-time vision system are considered.
Resumo:
How can one compute qualitative properties of the optical flow, such as expansion or rotation, in a way which is robust and invariant to the position of the focus of expansion or the center of rotation? We suggest a particularly simple algorithm, well-suited to VLSI implementations, that exploits well-known relations between the integral and differential properties of vector fields and their linear behaviour near singularities.
Resumo:
In the first part of this paper we show that a new technique exploiting 1D correlation of 2D or even 1D patches between successive frames may be sufficient to compute a satisfactory estimation of the optical flow field. The algorithm is well-suited to VLSI implementations. The sparse measurements provided by the technique can be used to compute qualitative properties of the flow for a number of different visual tsks. In particular, the second part of the paper shows how to combine our 1D correlation technique with a scheme for detecting expansion or rotation ([5]) in a simple algorithm which also suggests interesting biological implications. The algorithm provides a rough estimate of time-to-crash. It was tested on real image sequences. We show its performance and compare the results to previous approaches.
Resumo:
This thesis describes the design and implementation of an integrated circuit and associated packaging to be used as the building block for the data routing network of a large scale shared memory multiprocessor system. A general purpose multiprocessor depends on high-bandwidth, low-latency communications between computing elements. This thesis describes the design and construction of RN1, a novel self-routing, enhanced crossbar switch as a CMOS VLSI chip. This chip provides the basic building block for a scalable pipelined routing network with byte-wide data channels. A series of RN1 chips can be cascaded with no additional internal network components to form a multistage fault-tolerant routing switch. The chip is designed to operate at clock frequencies up to 100Mhz using Hewlett-Packard's HP34 $1.2\\mu$ process. This aggressive performance goal demands that special attention be paid to optimization of the logic architecture and circuit design.
Resumo:
This thesis describes two programs for generating tests for digital circuits that exploit several kinds of expert knowledge not used by previous approaches. First, many test generation problems can be solved efficiently using operation relations, a novel representation of circuit behavior that connects internal component operations with directly executable circuit operations. Operation relations can be computed efficiently by searching traces of simulated circuit behavior. Second, experts write test programs rather than test vectors because programs are more readable and compact. Test programs can be constructed automatically by merging program fragments using expert-supplied goal-refinement rules and domain-independent planning techniques.
Resumo:
This thesis presents a new approach to building a design for testability (DFT) system. The system takes a digital circuit description, finds out the problems in testing it, and suggests circuit modifications to correct those problems. The key contributions of the thesis research are (1) setting design for testability in the context of test generation (TG), (2) using failures during FG to focus on testability problems, and (3) relating circuit modifications directly to the failures. A natural functionality set is used to represent the maximum functionalities that a component can have. The current implementation has only primitive domain knowledge and needs other work as well. However, armed with the knowledge of TG, it has already demonstrated its ability and produced some interesting results on a simple microprocessor.
Resumo:
This paper addresses the problem of efficiently computing the motor torques required to drive a lower-pair kinematic chain (e.g., a typical manipulator arm in free motion, or a mechanical leg in the swing phase) given the desired trajectory; i.e., the Inverse Dynamics problem. It investigates the high degree of parallelism inherent in the computations, and presents two "mathematically exact" formulations especially suited to high-speed, highly parallel implementations using special-purpose hardware or VLSI devices. In principle, the formulations should permit the calculations to run at a speed bounded only by I/O. The first presented is a parallel version of the recent linear Newton-Euler recursive algorithm. The time cost is also linear in the number of joints, but the real-time coefficients are reduced by almost two orders of magnitude. The second formulation reports a new parallel algorithm which shows that it is possible to improve upon the linear time dependency. The real time required to perform the calculations increases only as the [log2] of the number of joints. Either formulation is susceptible to a systolic pipelined architecture in which complete sets of joint torques emerge at successive intervals of four floating-point operations. Hardware requirements necessary to support the algorithm are considered and found not to be excessive, and a VLSI implementation architecture is suggested. We indicate possible applications to incorporating dynamical considerations into trajectory planning, e.g. it may be possible to build an on-line trajectory optimizer.
Resumo:
I have designed and implemented a system for the multilevel verification of synchronous MOS VLSI circuits. The system, called Silica Pithecus, accepts the schematic of an MOS circuit and a specification of the circuit's intended digital behavior. Silica Pithecus determines if the circuit meets its specification. If the circuit fails to meet its specification Silica Pithecus returns to the designer the reason for the failure. Unlike earlier verifiers which modelled primitives (e.g., transistors) as unidirectional digital devices, Silica Pithecus models primitives more realistically. Transistors are modelled as bidirectional devices of varying resistances, and nodes are modelled as capacitors. Silica Pithecus operates hierarchically, interactively, and incrementally. Major contributions of this research include a formal understanding of the relationship between different behavioral descriptions (e.g., signal, boolean, and arithmetic descriptions) of the same device, and a formalization of the relationship between the structure, behavior, and context of device. Given these formal structures my methods find sufficient conditions on the inputs of circuits which guarantee the correct operation of the circuit in the desired descriptive domain. These methods are algorithmic and complete. They also handle complex phenomena such as races and charge sharing. Informal notions such as races and hazards are shown to be derivable from the correctness conditions used by my methods.
Resumo:
This thesis describes Optimist, an optimizing compiler for the Concurrent Smalltalk language developed by the Concurrent VLSI Architecture Group. Optimist compiles Concurrent Smalltalk to the assembly language of the Message-Driven Processor (MDP). The compiler includes numerous optimization techniques such as dead code elimination, dataflow analysis, constant folding, move elimination, concurrency analysis, duplicate code merging, tail forwarding, use of register variables, as well as various MDP-specific optimizations in the code generator. The MDP presents some unique challenges and opportunities for compilation. Due to the MDP's small memory size, it is critical that the size of the generated code be as small as possible. The MDP is an inherently concurrent processor with efficient mechanisms for sending and receiving messages; the compiler takes advantage of these mechanisms. The MDP's tagged architecture allows very efficient support of object-oriented languages such as Concurrent Smalltalk. The initial goals for the MDP were to have the MDP execute about twenty instructions per method and contain 4096 words of memory. This compiler shows that these goals are too optimistic -- most methods are longer, both in terms of code size and running time. Thus, the memory size of the MDP should be increased.
Resumo:
By enhancing a real scene with computer generated objects, Augmented Reality (AR), has proven itself as a valuable Human-Computer Interface (HCI) in numerous application areas such as medical, military, entertainment and manufacturing. It enables higher performance of on-site tasks with seamless presentation of up-to-date, task-related information to the users during the operation. AR has potentials in design because the current interface provided by Computer-aided Design (CAD) packages is less intuitive and reports show that the presence of physical objects help design thinking and communication. This research explores the use of AR to improve the efficiency of a design process, specifically in mechanical design.
Resumo:
We contribute a quantitative and systematic model to capture etch non-uniformity in deep reactive ion etch of microelectromechanical systems (MEMS) devices. Deep reactive ion etch is commonly used in MEMS fabrication where high-aspect ratio features are to be produced in silicon. It is typical for many supposedly identical devices, perhaps of diameter 10 mm, to be etched simultaneously into one silicon wafer of diameter 150 mm. Etch non-uniformity depends on uneven distributions of ion and neutral species at the wafer level, and on local consumption of those species at the device, or die, level. An ion–neutral synergism model is constructed from data obtained from etching several layouts of differing pattern opening densities. Such a model is used to predict wafer-level variation with an r.m.s. error below 3%. This model is combined with a die-level model, which we have reported previously, on a MEMS layout. The two-level model is shown to enable prediction of both within-die and wafer-scale etch rate variation for arbitrary wafer loadings.