5 resultados para compiler

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Diplomityö tarkastelee säikeistettyä ohjelmointia rinnakkaisohjelmoinnin ylemmällä hierarkiatasolla tarkastellen erityisesti hypersäikeistysteknologiaa. Työssä tarkastellaan hypersäikeistyksen hyviä ja huonoja puolia sekä sen vaikutuksia rinnakkaisalgoritmeihin. Työn tavoitteena oli ymmärtää Intel Pentium 4 prosessorin hypersäikeistyksen toteutus ja mahdollistaa sen hyödyntäminen, missä se tuo suorituskyvyllistä etua. Työssä kerättiin ja analysoitiin suorituskykytietoa ajamalla suuri joukko suorituskykytestejä eri olosuhteissa (muistin käsittely, kääntäjän asetukset, ympäristömuuttujat...). Työssä tarkasteltiin kahdentyyppisiä algoritmeja: matriisioperaatioita ja lajittelua. Näissä sovelluksissa on säännöllinen muistinkäyttökuvio, mikä on kaksiteräinen miekka. Se on etu aritmeettis-loogisissa prosessoinnissa, mutta toisaalta huonontaa muistin suorituskykyä. Syynä siihen on nykyaikaisten prosessorien erittäin hyvä raaka suorituskyky säännöllistä dataa käsiteltäessä, mutta muistiarkkitehtuuria rajoittaa välimuistien koko ja useat puskurit. Kun ongelman koko ylittää tietyn rajan, todellinen suorituskyky voi pudota murto-osaan huippusuorituskyvystä.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As the development of integrated circuit technology continues to follow Moore’s law the complexity of circuits increases exponentially. Traditional hardware description languages such as VHDL and Verilog are no longer powerful enough to cope with this level of complexity and do not provide facilities for hardware/software codesign. Languages such as SystemC are intended to solve these problems by combining the powerful expression of high level programming languages and hardware oriented facilities of hardware description languages. To fully replace older languages in the desing flow of digital systems SystemC should also be synthesizable. The devices required by modern high speed networks often share the same tight constraints for e.g. size, power consumption and price with embedded systems but have also very demanding real time and quality of service requirements that are difficult to satisfy with general purpose processors. Dedicated hardware blocks of an application specific instruction set processor are one way to combine fast processing speed, energy efficiency, flexibility and relatively low time-to-market. Common features can be identified in the network processing domain making it possible to develop specialized but configurable processor architectures. One such architecture is the TACO which is based on transport triggered architecture. The architecture offers a high degree of parallelism and modularity and greatly simplified instruction decoding. For this M.Sc.(Tech) thesis, a simulation environment for the TACO architecture was developed with SystemC 2.2 using an old version written with SystemC 1.0 as a starting point. The environment enables rapid design space exploration by providing facilities for hw/sw codesign and simulation and an extendable library of automatically configured reusable hardware blocks. Other topics that are covered are the differences between SystemC 1.0 and 2.2 from the viewpoint of hardware modeling, and compilation of a SystemC model into synthesizable VHDL with Celoxica Agility SystemC Compiler. A simulation model for a processor for TCP/IP packet validation was designed and tested as a test case for the environment.

Relevância:

10.00% 10.00%

Publicador:

Relevância:

10.00% 10.00%

Publicador:

Resumo:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.