792 resultados para Processor architecture


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Advances in computer memory technology justify research towards new and different views on computer organization. This paper proposes a novel memory-centric computing architecture with the goal to merge memory and processing elements in order to provide better conditions for parallelization and performance. The paper introduces the architectural concepts and afterwards shows the design and implementation of a corresponding assembler and simulator.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Multi-core processors is a design philosophy that has become mainstream in scientific and engineering applications. Increasing performance and gate capacity of recent FPGA devices has permitted complex logic systems to be implemented on a single programmable device. By using VHDL here we present an implementation of one multi-core processor by using the PLASMA IP core based on the (most) MIPS I ISA and give an overview of the processor architecture and share theexecution results.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents SiMR, a simulator of the Rudimentary Machine designed to be used in a first course of computer architecture of Software Engineering and Computer Engineering programmes. The Rudimentary Machine contains all the basic elements in a RISC computer, and SiMR allows editing, assembling and executing programmes for this processor. SiMR is used at the Universitat Oberta de Catalunya as one of the most important resources in the Virtual Computing Architecture and Organisation Laboratory, since students work at home with the simulator and reports containing their work are automatically generated to be evaluated by lecturers. The results obtained from a survey show that most of the students consider SiMR as a highly necessary or even an indispensable resource to learn the basic concepts about computer architecture.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This thesis deals with a hardware accelerated Java virtual machine, named REALJava. The REALJava virtual machine is targeted for resource constrained embedded systems. The goal is to attain increased computational performance with reduced power consumption. While these objectives are often seen as trade-offs, in this context both of them can be attained simultaneously by using dedicated hardware. The target level of the computational performance of the REALJava virtual machine is initially set to be as fast as the currently available full custom ASIC Java processors. As a secondary goal all of the components of the virtual machine are designed so that the resulting system can be scaled to support multiple co-processor cores. The virtual machine is designed using the hardware/software co-design paradigm. The partitioning between the two domains is flexible, allowing customizations to the resulting system, for instance the floating point support can be omitted from the hardware in order to decrease the size of the co-processor core. The communication between the hardware and the software domains is encapsulated into modules. This allows the REALJava virtual machine to be easily integrated into any system, simply by redesigning the communication modules. Besides the virtual machine and the related co-processor architecture, several performance enhancing techniques are presented. These include techniques related to instruction folding, stack handling, method invocation, constant loading and control in time domain. The REALJava virtual machine is prototyped using three different FPGA platforms. The original pipeline structure is modified to suit the FPGA environment. The performance of the resulting Java virtual machine is evaluated against existing Java solutions in the embedded systems field. The results show that the goals are attained, both in terms of computational performance and power consumption. Especially the computational performance is evaluated thoroughly, and the results show that the REALJava is more than twice as fast as the fastest full custom ASIC Java processor. In addition to standard Java virtual machine benchmarks, several new Java applications are designed to both verify the results and broaden the spectrum of the tests.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this and a preceding paper, we provide an introduction to the Fujitsu VPP range of vector-parallel supercomputers and to some of the computational chemistry software available for the VPP. Here, we consider the implementation and performance of seven popular chemistry application packages. The codes discussed range from classical molecular dynamics to semiempirical and ab initio quantum chemistry. All have evolved from sequential codes, and have typically been parallelised using a replicated data approach. As such they are well suited to the large-memory/fast-processor architecture of the VPP. For one code, CASTEP, a distributed-memory data-driven parallelisation scheme is presented. (C) 2000 Published by Elsevier Science B.V. All rights reserved.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

En termes de temps d'execució i ús de dades, les aplicacions paral·leles/distribuïdes poden tenir execucions variables, fins i tot quan s'empra el mateix conjunt de dades d'entrada. Existeixen certs aspectes de rendiment relacionats amb l'entorn que poden afectar dinàmicament el comportament de l'aplicació, tals com: la capacitat de la memòria, latència de la xarxa, el nombre de nodes, l'heterogeneïtat dels nodes, entre d'altres. És important considerar que l'aplicació pot executar-se en diferents configuracions de maquinari i el desenvolupador d'aplicacions no port garantir que els ajustaments de rendiment per a un sistema en particular continuïn essent vàlids per a d'altres configuracions. L'anàlisi dinàmica de les aplicacions ha demostrat ser el millor enfocament per a l'anàlisi del rendiment per dues raons principals. En primer lloc, ofereix una solució molt còmoda des del punt de vista dels desenvolupadors mentre que aquests dissenyen i evaluen les seves aplicacions paral·leles. En segon lloc, perquè s'adapta millor a l'aplicació durant l'execució. Aquest enfocament no requereix la intervenció de desenvolupadors o fins i tot l'accés al codi font de l'aplicació. S'analitza l'aplicació en temps real d'execució i es considra i analitza la recerca dels possibles colls d'ampolla i optimitzacions. Per a optimitzar l'execució de l'aplicació bioinformàtica mpiBLAST, vam analitzar el seu comportament per a identificar els paràmetres que intervenen en el rendiment d'ella, com ara: l'ús de la memòria, l'ús de la xarxa, patrons d'E/S, el sistema de fitxers emprat, l'arquitectura del processador, la grandària de la base de dades biològica, la grandària de la seqüència de consulta, la distribució de les seqüències dintre d'elles, el nombre de fragments de la base de dades i/o la granularitat dels treballs assignats a cada procés. El nostre objectiu és determinar quins d'aquests paràmetres tenen major impacte en el rendiment de les aplicacions i com ajustar-los dinàmicament per a millorar el rendiment de l'aplicació. Analitzant el rendiment de l'aplicació mpiBLAST hem trobat un conjunt de dades que identifiquen cert nivell de serial·lització dintre l'execució. Reconeixent l'impacte de la caracterització de les seqüències dintre de les diferents bases de dades i una relació entre la capacitat dels workers i la granularitat de la càrrega de treball actual, aquestes podrien ser sintonitzades dinàmicament. Altres millores també inclouen optimitzacions relacionades amb el sistema de fitxers paral·lel i la possibilitat d'execució en múltiples multinucli. La grandària de gra de treball està influenciat per factors com el tipus de base de dades, la grandària de la base de dades, i la relació entre grandària de la càrrega de treball i la capacitat dels treballadors.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The motivation for this research initiated from the abrupt rise and fall of minicomputers which were initially used both for industrial automation and business applications due to their significantly lower cost than their predecessors, the mainframes. Later industrial automation developed its own vertically integrated hardware and software to address the application needs of uninterrupted operations, real-time control and resilience to harsh environmental conditions. This has led to the creation of an independent industry, namely industrial automation used in PLC, DCS, SCADA and robot control systems. This industry employs today over 200'000 people in a profitable slow clockspeed context in contrast to the two mainstream computing industries of information technology (IT) focused on business applications and telecommunications focused on communications networks and hand-held devices. Already in 1990s it was foreseen that IT and communication would merge into one Information and communication industry (ICT). The fundamental question of the thesis is: Could industrial automation leverage a common technology platform with the newly formed ICT industry? Computer systems dominated by complex instruction set computers (CISC) were challenged during 1990s with higher performance reduced instruction set computers (RISC). RISC started to evolve parallel to the constant advancement of Moore's law. These developments created the high performance and low energy consumption System-on-Chip architecture (SoC). Unlike to the CISC processors RISC processor architecture is a separate industry from the RISC chip manufacturing industry. It also has several hardware independent software platforms consisting of integrated operating system, development environment, user interface and application market which enables customers to have more choices due to hardware independent real time capable software applications. An architecture disruption merged and the smartphone and tablet market were formed with new rules and new key players in the ICT industry. Today there are more RISC computer systems running Linux (or other Unix variants) than any other computer system. The astonishing rise of SoC based technologies and related software platforms in smartphones created in unit terms the largest installed base ever seen in the history of computers and is now being further extended by tablets. An underlying additional element of this transition is the increasing role of open source technologies both in software and hardware. This has driven the microprocessor based personal computer industry with few dominating closed operating system platforms into a steep decline. A significant factor in this process has been the separation of processor architecture and processor chip production and operating systems and application development platforms merger into integrated software platforms with proprietary application markets. Furthermore the pay-by-click marketing has changed the way applications development is compensated: Three essays on major trends in a slow clockspeed industry: The case of industrial automation 2014 freeware, ad based or licensed - all at a lower price and used by a wider customer base than ever before. Moreover, the concept of software maintenance contract is very remote in the app world. However, as a slow clockspeed industry, industrial automation has remained intact during the disruptions based on SoC and related software platforms in the ICT industries. Industrial automation incumbents continue to supply systems based on vertically integrated systems consisting of proprietary software and proprietary mainly microprocessor based hardware. They enjoy admirable profitability levels on a very narrow customer base due to strong technology-enabled customer lock-in and customers' high risk leverage as their production is dependent on fault-free operation of the industrial automation systems. When will this balance of power be disrupted? The thesis suggests how industrial automation could join the mainstream ICT industry and create an information, communication and automation (ICAT) industry. Lately the Internet of Things (loT) and weightless networks, a new standard leveraging frequency channels earlier occupied by TV broadcasting, have gradually started to change the rigid world of Machine to Machine (M2M) interaction. It is foreseeable that enough momentum will be created that the industrial automation market will in due course face an architecture disruption empowered by these new trends. This thesis examines the current state of industrial automation subject to the competition between the incumbents firstly through a research on cost competitiveness efforts in captive outsourcing of engineering, research and development and secondly researching process re- engineering in the case of complex system global software support. Thirdly we investigate the industry actors', namely customers, incumbents and newcomers, views on the future direction of industrial automation and conclude with our assessments of the possible routes industrial automation could advance taking into account the looming rise of the Internet of Things (loT) and weightless networks. Industrial automation is an industry dominated by a handful of global players each of them focusing on maintaining their own proprietary solutions. The rise of de facto standards like IBM PC, Unix and Linux and SoC leveraged by IBM, Compaq, Dell, HP, ARM, Apple, Google, Samsung and others have created new markets of personal computers, smartphone and tablets and will eventually also impact industrial automation through game changing commoditization and related control point and business model changes. This trend will inevitably continue, but the transition to a commoditized industrial automation will not happen in the near future.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A parallel processor architecture based on a communicating sequential processor chip, the transputer, is described. The architecture is easily linearly extensible to enable separate functions to be included in the controller. To demonstrate the power of the resulting controller some experimental results are presented comparing PID and full inverse dynamics on the first three joints of a Puma 560 robot. Also examined are some of the sample rate issues raised by the asynchronous updating of inertial parameters, and the need for full inverse dynamics at every sample interval is questioned.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Con este proyecto se pretende crear un procedimiento general para la implantación de aplicaciones de procesado de imágenes en cámaras de video IP y la distribución de dicha información mediante Arquitecturas Orientadas a Servicios (SOA). El objetivo principal es crear una aplicación que se ejecute en una cámara de video IP y realice un procesado básico sobre las imágenes capturadas (detección de colores, formas y patrones) permitiendo distribuir el resultado del procesado mediante las arquitecturas SOA descritas en la especificación DPWS (Device Profile for Web Services). El estudio se va a centrar principalmente en la transformación automática de código de procesado de imágenes escrito en Matlab (archivos .m) a un código C ANSI (archivos .c) que posteriormente se compilará para la arquitectura del procesador de la cámara (arquitectura CRIS, similar a la RISC pero con un conjunto reducido de instrucciones). ABSTRACT. This project aims to create a general procedure for the implementation of image processing applications in IP video cameras and the distribution of such information through Service Oriented Architectures (SOA). The main goal is to create an application that runs on IP video camera and carry out a basic processing on the captured images ( color detection, shapes and patterns) allowing to distribute the result of process by SOA architectures described in the DPWS specification (Device Profile for Web Services). The study will focus primarily on the automated transform of image processing code written in Matlab files (. M) to ANSI C code files (. C) which is then compiled to the processor architecture of the camera (CRIS architecture , similar to the RISC but with a reduced instruction set).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Paper submitted to Euromicro Symposium on Digital Systems Design (DSD), Belek-Antalya, Turkey, 2003.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Over the past few decades, we have been enjoying tremendous benefits thanks to the revolutionary advancement of computing systems, driven mainly by the remarkable semiconductor technology scaling and the increasingly complicated processor architecture. However, the exponentially increased transistor density has directly led to exponentially increased power consumption and dramatically elevated system temperature, which not only adversely impacts the system's cost, performance and reliability, but also increases the leakage and thus the overall power consumption. Today, the power and thermal issues have posed enormous challenges and threaten to slow down the continuous evolvement of computer technology. Effective power/thermal-aware design techniques are urgently demanded, at all design abstraction levels, from the circuit-level, the logic-level, to the architectural-level and the system-level. ^ In this dissertation, we present our research efforts to employ real-time scheduling techniques to solve the resource-constrained power/thermal-aware, design-optimization problems. In our research, we developed a set of simple yet accurate system-level models to capture the processor's thermal dynamic as well as the interdependency of leakage power consumption, temperature, and supply voltage. Based on these models, we investigated the fundamental principles in power/thermal-aware scheduling, and developed real-time scheduling techniques targeting at a variety of design objectives, including peak temperature minimization, overall energy reduction, and performance maximization. ^ The novelty of this work is that we integrate the cutting-edge research on power and thermal at the circuit and architectural-level into a set of accurate yet simplified system-level models, and are able to conduct system-level analysis and design based on these models. The theoretical study in this work serves as a solid foundation for the guidance of the power/thermal-aware scheduling algorithms development in practical computing systems.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes and describes an architecture that allows the both engineer and programmer for defining and quantifying which peripheral of a microcontroller will be important to the particular project. For each application, it is necessary to use different types of peripherals. In this study, we have verified the possibility for emulating the behavior of peripheral in specifically CPUs. These CPUs hold a RAM memory, where code spaces specifically written for them could represent the behavior of some target peripheral, which are loaded and executed on it. We believed that the proposed architecture will provide larger flexibility in the use of the microcontrolles since this ""dedicated hardware components"" don`t execute to a special function, but it is a hardware capable to self adapt to the needs of each project. This research had as fundament a comparative study of four current microcontrollers. Preliminary tests using VHDL and FPGAs were done.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we survey the most relevant results for the prioritybased schedulability analysis of real-time tasks, both for the fixed and dynamic priority assignment schemes. We give emphasis to the worst-case response time analysis in non-preemptive contexts, which is fundamental for the communication schedulability analysis. We define an architecture to support priority-based scheduling of messages at the application process level of a specific fieldbus communication network, the PROFIBUS. The proposed architecture improves the worst-case messages’ response time, overcoming the limitation of the first-come-first-served (FCFS) PROFIBUS queue implementations.