9 results for Hardware Implementation
at Massachusetts Institute of Technology
Abstract:
The furious pace of Moore's Law is driving computer architecture into a realm where the speed of light is the dominant factor in system latencies. The number of clock cycles needed to span a chip is increasing, while the number of bits that can be accessed within a clock cycle is decreasing. Hence, it is becoming more difficult to hide latency. One alternative solution is to reduce latency by migrating threads and data, but the overhead of existing implementations has so far made migration an unserviceable solution. I present an architecture, implementation, and mechanisms that reduce the overhead of migration to the point where migration is a viable supplement to other latency-hiding mechanisms, such as multithreading. The architecture is abstract, and presents programmers with a simple, uniform, fine-grained multithreaded parallel programming model with implicit memory management. In other words, the spatial nature and implementation details (such as the number of processors) of a parallel machine are entirely hidden from the programmer. Compiler writers are encouraged to devise programming languages for the machine that guide programmers to express their ideas in terms of objects, since objects exhibit an inherent physical locality of data and code. The machine implementation can then leverage this locality to automatically distribute data and threads across the physical machine by using a set of high-performance migration mechanisms. An implementation of this architecture could migrate a null thread in 66 cycles, an improvement of more than a factor of 1000 over previous work. Performance also scales well; the time required to move a typical thread is only 4 to 5 times that of a null thread. Data migration performance is similar, and scales linearly with data block size.
Since the performance of the migration mechanism is on par with that of an L2 cache, the implementation simulated in my work has no data caches and relies instead on multithreading and the migration mechanism to hide and reduce access latencies.
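The figures quoted in this abstract can be turned into a rough cost estimate. The sketch below is only a back-of-the-envelope model: the 66-cycle null-thread cost and the 4x to 5x factor come from the abstract, while the function names, the base cost for data migration, and the per-word slope `cycles_per_word` are hypothetical and chosen for illustration.

```python
NULL_THREAD_CYCLES = 66  # null-thread migration cost quoted in the abstract


def typical_thread_cycles(factor_low=4, factor_high=5):
    """Cycle range for a typical thread: 4x to 5x the null-thread cost."""
    return (factor_low * NULL_THREAD_CYCLES, factor_high * NULL_THREAD_CYCLES)


def data_migration_cycles(block_words, base_cycles=NULL_THREAD_CYCLES,
                          cycles_per_word=1.0):
    """Linear model of data migration cost.

    The abstract only states that cost scales linearly with block size;
    base_cycles and cycles_per_word are assumed values, not measurements.
    """
    return base_cycles + cycles_per_word * block_words


low, high = typical_thread_cycles()
print(f"typical thread: {low} to {high} cycles")
```

Under these assumptions a typical thread migrates in roughly 264 to 330 cycles.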
Abstract:
The Scheme86 and the HP Precision Architectures represent different trends in computer processor design. The former uses wide micro-instructions, parallel hardware, and a low-latency memory interface. The latter encourages pipelined implementation and visible interlocks. To compare the merits of these approaches, algorithms frequently encountered in numerical and symbolic computation were hand-coded for each architecture. Timings were done in simulators and the results were evaluated to determine the speed of each design. Based on these measurements, conclusions were drawn as to which aspects of each architecture are suitable for a high-performance computer.
Abstract:
This robot has low natural frequencies of vibration. Insights into the problems of designing joint and link flexibility are discussed. The robot has three flexible rotary actuators and two flexible, interchangeable links, and is controlled by three independent processors on a VMEbus. Results from experiments on the control of residual vibration for different types of robot motion are presented. Impulse prefiltering and slowly accelerating moves are compared and shown to be effective at reducing residual vibration.
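The abstract does not say which impulse prefilter was used. As an illustration only, the sketch below assumes one common form, the two-impulse zero-vibration (ZV) input shaper for a single vibration mode. The function names and the example frequency and damping values are hypothetical, not taken from the thesis.

```python
import math


def zv_shaper(natural_freq_hz, damping_ratio):
    """Return [(time_s, amplitude), ...] for a two-impulse ZV shaper."""
    omega_n = 2.0 * math.pi * natural_freq_hz
    root = math.sqrt(1.0 - damping_ratio ** 2)
    omega_d = omega_n * root  # damped natural frequency (rad/s)
    k = math.exp(-damping_ratio * math.pi / root)
    # Second impulse lands half a damped period after the first.
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / omega_d, k / (1.0 + k))]


def residual_vibration(impulses, natural_freq_hz, damping_ratio):
    """Residual vibration amplitude relative to a single unit impulse."""
    omega_n = 2.0 * math.pi * natural_freq_hz
    omega_d = omega_n * math.sqrt(1.0 - damping_ratio ** 2)
    t_last = max(t for t, _ in impulses)
    c = sum(a * math.exp(damping_ratio * omega_n * t) * math.cos(omega_d * t)
            for t, a in impulses)
    s = sum(a * math.exp(damping_ratio * omega_n * t) * math.sin(omega_d * t)
            for t, a in impulses)
    return math.exp(-damping_ratio * omega_n * t_last) * math.hypot(c, s)


# Assumed example mode: 2 Hz, 5% damping.
shaper = zv_shaper(2.0, 0.05)
print(residual_vibration(shaper, 2.0, 0.05))  # near zero at the design frequency
print(residual_vibration(shaper, 2.4, 0.05))  # nonzero when the mode is mis-modeled
```

Convolving a motion command with these two impulses delays the move by half a damped period. The second call shows that cancellation degrades when the actual frequency differs from the modeled one.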
Abstract:
This thesis addresses the problem of developing automatic grasping capabilities for robotic hands. Using a 2-jointed and a 4-jointed model of the hand, we establish the geometric conditions necessary for achieving form closure grasps of cylindrical objects. We then define and show how to construct the grasping pre-image for quasi-static (friction dominated) and zero-G (inertia dominated) motions for sensorless and sensor-driven grasps with and without arm motions. While the approach does not rely on detailed modeling, it is computationally inexpensive, reliable, and easy to implement. Example behaviors were successfully implemented on the Salisbury hand and on a planar 2-fingered, 4 degree-of-freedom hand.
Abstract:
As exploration of our solar system and outer space moves into the future, spacecraft are being developed to venture on increasingly challenging missions with bold objectives. The spacecraft tasked with completing these missions are becoming progressively more complex. This increases the potential for mission failure due to hardware malfunctions and unexpected spacecraft behavior. A solution to this problem lies in the development of an advanced fault management system. Fault management enables a spacecraft to respond to failures and take repair actions so that it may continue its mission. The two main approaches developed for spacecraft fault management have been rule-based and model-based systems. Rules map sensor information to system behaviors, thus achieving fast response times and making the actions of the fault management system explicit. These rules are developed by having a human reason through the interactions between spacecraft components. This process is limited by the number of interactions a human can reason about correctly. In the model-based approach, the human provides component models, and the fault management system reasons automatically about system-wide interactions and complex fault combinations. This approach improves correctness and makes explicit the underlying system models, whereas these are implicit in the rule-based approach. We propose a fault detection engine, Compiled Mode Estimation (CME), that unifies the strengths of the rule-based and model-based approaches. CME uses a compiled model to determine spacecraft behavior more accurately. Reasoning related to fault detection is compiled in an off-line process into a set of concurrent, localized diagnostic rules. These are then combined on-line along with sensor information to reconstruct the diagnosis of the system. These rules enable a human to inspect the diagnostic consequences of CME.
Additionally, CME is capable of reasoning through component interactions automatically while still providing fast and correct responses. The implementation of this engine has been tested against the NEAR spacecraft's advanced rule-based system, resulting in the detection of failures beyond those detected by the rules. This evolution in fault detection will enable future missions to explore the furthest reaches of the solar system without the burden of human intervention to repair failed components.
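The idea of combining precompiled, localized rules with sensor readings at run time can be illustrated with a toy example. The sketch below is not the CME algorithm. It assumes a hypothetical two-component system (a valve and a pump) and a hand-written rule table in which each rule maps one sensor observation to the set of component-mode assignments consistent with it. All names, modes, and rules are invented for illustration.

```python
from itertools import product

VALVE_MODES = ("ok", "stuck_closed")   # hypothetical component modes
PUMP_MODES = ("ok", "failed")

# Hypothetical "compiled" rules: (sensor, value) -> consistent (valve, pump) modes.
RULES = {
    ("flow", "high"): {("ok", "ok")},
    ("flow", "low"): {("stuck_closed", "ok"), ("ok", "failed"),
                      ("stuck_closed", "failed")},
    ("pressure", "high"): {("ok", "ok"), ("stuck_closed", "ok")},
    ("pressure", "low"): {("ok", "failed"), ("stuck_closed", "failed")},
}


def diagnose(readings):
    """Intersect the candidate sets of every rule triggered by the readings."""
    candidates = set(product(VALVE_MODES, PUMP_MODES))
    for observation in readings.items():
        # Observations with no matching rule leave the candidates unchanged.
        candidates &= RULES.get(observation, candidates)
    return candidates


print(diagnose({"flow": "low", "pressure": "high"}))
```

Each rule stays small and can be inspected by a person, while the intersection step accounts for the interaction between the two observations. In this toy case, low flow together with high pressure leaves a single candidate: a stuck valve with a working pump.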
Abstract:
This report formally documents the results of an assessment of the degree to which Lean Principles and Practices have been implemented in the US Aerospace and Defense Industry. An Industry Association team prepared it for the DCMA-DCAA-Industry Association “Crosstalk” Coalition in response to a “Crosstalk” meeting action request to the industry associations. The request was motivated by the many potential benefits to system product quality, affordability, and industry responsiveness that a high degree of industry Lean implementation can produce.
Abstract:
Since the rise of the industrial revolution, there have been few challenges that compare in scale and scope with the challenge of implementing lean principles in order to achieve high-performance work systems. This report summarizes key insights and learning by representatives from a cross section of organizations who are on this journey. Specifically, we report on findings from the first Lean Aircraft Initiative (LAI) Implementation Workshop, which was held on February 5-6, 1997.
Abstract:
The memory hierarchy is the main bottleneck in modern computer systems, as the gap between the speed of the processor and that of the memory continues to grow. The situation in embedded systems is even worse. The memory hierarchy consumes a large amount of chip area and energy, which are precious resources in embedded systems. Moreover, embedded systems have multiple design objectives, such as performance, energy consumption, and area. Customizing the memory hierarchy for specific applications is a very important way to take full advantage of limited resources to maximize performance. However, traditional custom memory hierarchy design methodologies are phase-ordered. They separate application optimization from memory hierarchy architecture design, which tends to result in locally optimal solutions. In traditional hardware-software co-design methodologies, much of the work has focused on utilizing reconfigurable logic to partition the computation. However, utilizing reconfigurable logic to perform memory hierarchy design is seldom addressed. In this paper, we propose a new framework for designing the memory hierarchy for embedded systems. The framework takes advantage of flexible reconfigurable logic to customize the memory hierarchy for specific applications. It combines application optimization and memory hierarchy design to obtain a globally optimal solution. Using the framework, we performed a case study to design a new software-controlled instruction memory that showed promising potential.
Abstract:
Since the rise of the industrial revolution, there have been few challenges that compare in scale and scope with the challenge of implementing lean principles in order to achieve high-performance work systems. This report summarizes key insights and learning by representatives from a cross section of organizations who are on this journey. Specifically, we report on findings from the first Lean Aircraft Initiative (LAI) Implementation Workshop, which was held on February 5-6, 1997. The report is not a “cookbook” or a “how to” manual. Rather, it is a summary of the first phase in a learning process. It is designed to codify lessons learned, facilitate diffusion among people not at the session, and set the stage for further learning about implementation.