9 resultados para Parallel computers - Design and construction
em Massachusetts Institute of Technology
Resumo:
This paper presents the research and development of a 3-legged micro Parallel Kinematic Manipulator (PKM) for positioning in micro-machining and assembly operations. The structural characteristics associated with parallel manipulators are evaluated and the PKMs with translational and rotational movements are identified. Based on these identifications, a hybrid 3-UPU (Universal Joint-Prismatic Joint-Universal Joint) parallel manipulator is designed and fabricated. The principles of the operation and modeling of this micro PKM is largely similar to a normal size Stewart Platform (SP). A modular design methodology is introduced for the construction of this micro PKM. Calibration results of this hybrid 3-UPU PKM are discussed in this paper.
Resumo:
Parallel shared-memory machines with hundreds or thousands of processor-memory nodes have been built; in the future we will see machines with millions or even billions of nodes. Associated with such large systems is a new set of design challenges. Many problems must be addressed by an architecture in order for it to be successful; of these, we focus on three in particular. First, a scalable memory system is required. Second, the network messaging protocol must be fault-tolerant. Third, the overheads of thread creation, thread management and synchronization must be extremely low. This thesis presents the complete system design for Hamal, a shared-memory architecture which addresses these concerns and is directly scalable to one million nodes. Virtual memory and distributed objects are implemented in a manner that requires neither inter-node synchronization nor the storage of globally coherent translations at each node. We develop a lightweight fault-tolerant messaging protocol that guarantees message delivery and idempotence across a discarding network. A number of hardware mechanisms provide efficient support for massive multithreading and fine-grained synchronization. Experiments are conducted in simulation, using a trace-driven network simulator to investigate the messaging protocol and a cycle-accurate simulator to evaluate the Hamal architecture. We determine implementation parameters for the messaging protocol which optimize performance. A discarding network is easier to design and can be clocked at a higher rate, and we find that with this protocol its performance can approach that of a non-discarding network. Our simulations of Hamal demonstrate the effectiveness of its thread management and synchronization primitives. In particular, we find register-based synchronization to be an extremely efficient mechanism which can be used to implement a software barrier with a latency of only 523 cycles on a 512 node machine.
Resumo:
A revolution\0\0\0 in earthmoving, a $100 billion industry, can be achieved with three components: the GPS location system, sensors and computers in bulldozers, and SITE CONTROLLER, a central computer system that maintains design data and directs operations. The first two components are widely available; I built SITE CONTROLLER to complete the triangle and describe it here. SITE CONTROLLER assists civil engineers in the design, estimation, and construction of earthworks, including hazardous waste site remediation. The core of SITE CONTROLLER is a site modelling system that represents existing and prospective terrain shapes, roads, hydrology, etc. Around this core are analysis, simulation, and vehicle control tools. Integrating these modules into one program enables civil engineers and contractors to use a single interface and database throughout the life of a project.
Resumo:
This technical report describes a new protocol, the Unique Token Protocol, for reliable message communication. This protocol eliminates the need for end-to-end acknowledgments and minimizes the communication effort when no dynamic errors occur. Various properties of end-to-end protocols are presented. The unique token protocol solves the associated problems. It eliminates source buffering by maintaining in the network at least two copies of a message. A token is used to decide if a message was delivered to the destination exactly once. This technical report also presents a possible implementation of the protocol in a worm-hole routed, 3-D mesh network.
Resumo:
This thesis describes the design and implementation of an integrated circuit and associated packaging to be used as the building block for the data routing network of a large scale shared memory multiprocessor system. A general purpose multiprocessor depends on high-bandwidth, low-latency communications between computing elements. This thesis describes the design and construction of RN1, a novel self-routing, enhanced crossbar switch as a CMOS VLSI chip. This chip provides the basic building block for a scalable pipelined routing network with byte-wide data channels. A series of RN1 chips can be cascaded with no additional internal network components to form a multistage fault-tolerant routing switch. The chip is designed to operate at clock frequencies up to 100Mhz using Hewlett-Packard's HP34 $1.2\\mu$ process. This aggressive performance goal demands that special attention be paid to optimization of the logic architecture and circuit design.
Resumo:
Act2 is a highly concurrent programming language designed to exploit the processing power available from parallel computer architectures. The language supports advanced concepts in software engineering, providing high-level constructs suitable for implementing artificially-intelligent applications. Act2 is based on the Actor model of computation, consisting of virtual computational agents which communicate by message-passing. Act2 serves as a framework in which to integrate an actor language, a description and reasoning system, and a problem-solving and resource management system. This document describes issues in Act2's design and the implementation of an interpreter for the language.
Resumo:
This thesis presents SodaBot, a general-purpose software agent user-environment and construction system. Its primary component is the basic software agent --- a computational framework for building agents which is essentially an agent operating system. We also present a new language for programming the basic software agent whose primitives are designed around human-level descriptions of agent activity. Via this programming language, users can easily implement a wide-range of typical software agent applications, e.g. personal on-line assistants and meeting scheduling agents. The SodaBot system has been implemented and tested, and its description comprises the bulk of this thesis.
Resumo:
Previous research in force control has focused on the choice of appropriate servo implementation without corresponding regard to the choice of mechanical hardware. This report analyzes the effect of mechanical properties such as contact compliance, actuator-to-joint compliance, torque ripple, and highly nonlinear dry friction in the transmission mechanisms of a manipulator. A set of requisites for high performance then guides the development of mechanical-design and servo strategies for improved performance. A single-degree-of-freedom transmission testbed was constructed that confirms the predicted effect of Coulomb friction on robustness; design and construction of a cable-driven, four-degree-of- freedom, "whole-arm" manipulator illustrates the recommended design strategies.
Resumo:
Conventional parallel computer architectures do not provide support for non-uniformly distributed objects. In this thesis, I introduce sparsely faceted arrays (SFAs), a new low-level mechanism for naming regions of memory, or facets, on different processors in a distributed, shared memory parallel processing system. Sparsely faceted arrays address the disconnect between the global distributed arrays provided by conventional architectures (e.g. the Cray T3 series), and the requirements of high-level parallel programming methods that wish to use objects that are distributed over only a subset of processing elements. A sparsely faceted array names a virtual globally-distributed array, but actual facets are lazily allocated. By providing simple semantics and making efficient use of memory, SFAs enable efficient implementation of a variety of non-uniformly distributed data structures and related algorithms. I present example applications which use SFAs, and describe and evaluate simple hardware mechanisms for implementing SFAs. Keeping track of which nodes have allocated facets for a particular SFA is an important task that suggests the need for automatic memory management, including garbage collection. To address this need, I first argue that conventional tracing techniques such as mark/sweep and copying GC are inherently unscalable in parallel systems. I then present a parallel memory-management strategy, based on reference-counting, that is capable of garbage collecting sparsely faceted arrays. I also discuss opportunities for hardware support of this garbage collection strategy. I have implemented a high-level hardware/OS simulator featuring hardware support for sparsely faceted arrays and automatic garbage collection. I describe the simulator and outline a few of the numerous details associated with a "real" implementation of SFAs and SFA-aware garbage collection. Simulation results are used throughout this thesis in the evaluation of hardware support mechanisms.