169 resultados para Parallelizing Compilers


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Previous work on formally modelling and analysing program compilation has shown the need for a simple and expressive semantics for assembler level programs. Assembler programs contain unstructured jumps and previous formalisms have modelled these by using continuations, or by embedding the program in an explicit emulator. We propose a simpler approach, which uses techniques from compiler theory in a formal setting. This approach is based on an interpretation of programs as collections of program paths, each of which has a weakest liberal precondition semantics. We then demonstrate, by example, how we can use this formalism to justify the compilation of block-structured high-level language programs into assembler.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Since the advent of High Level Programming languages (HLPLs) in the early 1950s researchers have sought ways to automate the construction of HLPL compilers. To this end a variety of Translator Writing Tools (TWTs) have been developed in the last three decades. However, only a very few of these tools have gained significant commercial acceptance. This thesis re-examines traditional compiler construction techniques, along with a number of previous TWTs, and proposes a new improved tool for automated compiler construction called the Aston Compiler Constructor (ACC). This new tool allows the specification of complete compilation systems using a high level compiler oriented specification notation called the Compiler Construction Language (CCL). This specification notation is based on a modern variant of Backus Naur Form (BNF) and an extended variant of Attribute Grammars (AGs). The implementation and processing of the CCL is discussed along with an extensive CCL example. The CCL is shown to have an extensive expressive power, to be convenient in use, and highly readable, and thus a superior alternative to earlier TWTs, and to traditional compiler construction techniques. The execution performance of CCL specifications is evaluated and shown to be acceptable. A number of related areas are also addressed, including tools for the rapid construction of individual compiler components, and tools for the construction of compilation systems for multiprocessor operating systems and hardware. This latter area is expected to become of particular interest in future years due to the anticipated increased use of multiprocessor architectures.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The paper presents basic notions and scientific achievements in the field of program transformations, describes usage of these achievements both in the professional activity (when developing optimizing and unparallelizing compilers) and in the higher education. It also analyzes main problems in this area. The concept of control of program transformation information is introduced in the form of specialized knowledge bank on computer program transformations to support the scientific research, education and professional activity in the field. The tasks that are solved by the knowledge bank are formulated. The paper is intended for experts in the artificial intelligence, optimizing compilation, postgraduates and senior students of corresponding specialties; it may be also interesting for university lecturers and instructors.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Mixed-content miscellanies (very frequent in the Byzantine and mediaeval Slavic written heritage) are usually defined as collections of works with non-occupational, non-liturgical application, and texts in them are selected and arranged according to no identifiable principle. It is a “readable” type of miscellanies which were compiled mainly on the basis of the cognitive interests of compilers and readers. Just like the occupational ones, they also appeared to satisfy public needs but were intended for individual usage. My textological comparison had shown that mixed- content miscellanies often showed evidence of a stable content – some of them include the same constituent works in the same order, regardless that the manuscripts had no obvious genetic relationship. These correspondences were sufficiently numerous and distinctive that they could not be merely fortuitous, and the only sensible interpretation was that even when the operative organizational principle was not based on independently identifiable criteria, such as the church calendar, liturgical function, or thematic considerations, mixed-content miscellanies (or, at least, portions of their contents) nonetheless fell into types. In this respect, the apparent free selection and arrangement of texts in mixed-content miscellanies turns out to be illusory. The problem was – as the corpus of manuscripts that I and my colleagues needed to examine grew – our ability to keep track of the structure of each one, and to identify structural correspondences among manuscripts within the corpus, diminished. So, at the end of 1993 I addressed a letter to Prof. David Birnbaum (University of Pittsburgh, PA) with a request to help me to solve the problem. He and my colleague Andrey Boyadzhiev (Sofia University) pointed out to me that computers are well suited to recording, processing, and analyzing large amounts of data, and to identifying patterns within the data, and their proposal was that we try to develop a computer system for description of manuscripts, for their analysis and of course, for searching the data. Our collaboration in this project is now ten years old, and our talk today presents an overview of that collaboration.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The paper presents a short review of some systems for program transformations performed on the basis of the internal intermediate representations of these programs. Many systems try to support several languages of representation of the source texts of programs and solve the task of their translation into the internal representation. This task is still a challenge as it is effort-consuming. To reduce the effort, different systems of translator construction, ready compilers with ready grammars of outside designers are used. Though this approach saves the effort, it has its drawbacks and constraints. The paper presents the general idea of using the mapping approach to solve the task within the framework of program transformations and overcome the disadvantages of the existing systems. The paper demonstrates a fragment of the ontology model of high-level languages mappings onto the single representation and gives the example of how the description of (a fragment) a particular mapping is represented in accordance with the ontology model.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We propose an ISA extension that decouples the data access and register write operations in a load instruction. We describe system and hardware support for decoupled loads. Furthermore, we show how compilers can generate better static instruction schedules by hoisting a decoupled load’s data access above may-alias stores and branches. We find that decoupled loads improve performance with geometric mean speedups of 8.4%.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Model for Prediction Across Scales (MPAS) is a novel set of Earth system simulation components and consists of an atmospheric model, an ocean model and a land-ice model. Its distinct features are the use of unstructured Voronoi meshes and C-grid discretisation to address shortcomings of global models on regular grids and the use of limited area models nested in a forcing data set, with respect to parallel scalability, numerical accuracy and physical consistency. This concept allows one to include the feedback of regional land use information on weather and climate at local and global scales in a consistent way, which is impossible to achieve with traditional limited area modelling approaches. Here, we present an in-depth evaluation of MPAS with regards to technical aspects of performing model runs and scalability for three medium-size meshes on four different high-performance computing (HPC) sites with different architectures and compilers. We uncover model limitations and identify new aspects for the model optimisation that are introduced by the use of unstructured Voronoi meshes. We further demonstrate the model performance of MPAS in terms of its capability to reproduce the dynamics of the West African monsoon (WAM) and its associated precipitation in a pilot study. Constrained by available computational resources, we compare 11-month runs for two meshes with observations and a reference simulation from the Weather Research and Forecasting (WRF) model. We show that MPAS can reproduce the atmospheric dynamics on global and local scales in this experiment, but identify a precipitation excess for the West African region. Finally, we conduct extreme scaling tests on a global 3?km mesh with more than 65 million horizontal grid cells on up to half a million cores. We discuss necessary modifications of the model code to improve its parallel performance in general and specific to the HPC environment. We confirm good scaling (70?% parallel efficiency or better) of the MPAS model and provide numbers on the computational requirements for experiments with the 3?km mesh. In doing so, we show that global, convection-resolving atmospheric simulations with MPAS are within reach of current and next generations of high-end computing facilities.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes a fast integer sorting algorithm, herein referred to as Bit-index sort, which does not use comparisons and is intended to sort partial permutations. Experimental results exhibit linear complexity order in execution time. Bit-index sort uses a bit-array to classify input sequences of distinct integers, and exploits built-in bit functions in C compilers, supported by machine hardware, to retrieve the ordered output sequence. Results show that Bit-index sort outperforms quicksort and counting sort algorithms when compared in their execution time. A parallel approach for Bit-index sort using two simultaneous threads is also included, which obtains further speedups of up to 1.6 compared to its sequential case.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevância:

10.00% 10.00%

Publicador:

Resumo:

It is now clear that the concept of a HPC compiler which automatically produces highly efficient parallel implementations is a pipe-dream. Another route is to recognise from the outset that user information is required and to develop tools that embed user interaction in the transformation of code from scalar to parallel form, and then use conventional compilers with a set of communication calls. This represents the key idea underlying the development of the CAPTools software environment. The initial version of CAPTools is focused upon single block structured mesh computational mechanics codes. The capability for unstructured mesh codes is under test now and block structured meshes will be included next. The parallelisation process can be completed rapidly for modest codes and the parallel performance approaches that which is delivered by hand parallelisations.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Many-core systems are emerging from the need of more computational power and power efficiency. However there are many issues which still revolve around the many-core systems. These systems need specialized software before they can be fully utilized and the hardware itself may differ from the conventional computational systems. To gain efficiency from many-core system, programs need to be parallelized. In many-core systems the cores are small and less powerful than cores used in traditional computing, so running a conventional program is not an efficient option. Also in Network-on-Chip based processors the network might get congested and the cores might work at different speeds. In this thesis is, a dynamic load balancing method is proposed and tested on Intel 48-core Single-Chip Cloud Computer by parallelizing a fault simulator. The maximum speedup is difficult to obtain due to severe bottlenecks in the system. In order to exploit all the available parallelism of the Single-Chip Cloud Computer, a runtime approach capable of dynamically balancing the load during the fault simulation process is used. The proposed dynamic fault simulation approach on the Single-Chip Cloud Computer shows up to 45X speedup compared to a serial fault simulation approach. Many-core systems can draw enormous amounts of power, and if this power is not controlled properly, the system might get damaged. One way to manage power is to set power budget for the system. But if this power is drawn by just few cores of the many, these few cores get extremely hot and might get damaged. Due to increase in power density multiple thermal sensors are deployed on the chip area to provide realtime temperature feedback for thermal management techniques. Thermal sensor accuracy is extremely prone to intra-die process variation and aging phenomena. These factors lead to a situation where thermal sensor values drift from the nominal values. This necessitates efficient calibration techniques to be applied before the sensor values are used. In addition, in modern many-core systems cores have support for dynamic voltage and frequency scaling. Thermal sensors located on cores are sensitive to the core's current voltage level, meaning that dedicated calibration is needed for each voltage level. In this thesis a general-purpose software-based auto-calibration approach is also proposed for thermal sensors to calibrate thermal sensors on different range of voltages.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Many functional programs can be viewed as representation changers, that is, as functions that convert abstract values from one concrete representation to another. Examples of such programs include base-converters, binary adders and multipliers, and compilers. In this paper we give a number of different approaches to specifying representation changers (pointwise, functional, and relational), and present a simple technique that can be used to derive functional programs from the specifications.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper draws a parallel between document preparation and the traditional processes of compilation and link editing for computer programs. A block-based document model is described which allows for separate compilation of various portions of a document. These portions are brought together and merged by a linker program, called dlink, whose pilot implementation is based on ditroff and on its underlying intermediate code. In the light of experiences with dlink the requirements for a universal object-module language for documents are discussed. These requirements often resemble the characteristics of the intermediate codes used by programming-language compilers but with interesting extra constraints which arise from the way documents are executed .

Relevância:

10.00% 10.00%

Publicador:

Resumo:

For various reasons, many Algol 68 compilers do not directly implement the parallel processing operations defined in the Revised Algol 68 Report. It is still possible however, to perform parallel processing, multitasking and simulation provided that the implementation permits the creation of a master routine for the coordination and initiation of processes under its control. The package described here is intended for real time applications and runs in conjunction with the Algol 68R system; it extends and develops the original Algol 68RT package, which was designed for use with multiplexers at the Royal Radar Establishment, Malvern. The facilities provided, in addition to the synchronising operations, include an interface to an ICL Communications Processor enabling the abstract processes to be realised as the interaction of several teletypes or visual display units with a real time program providing a useful service.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The persistence concern implemented as an aspect has been studied since the appearance of the Aspect-Oriented paradigm. Frequently, persistence is given as an example that can be aspectized, but until today no real world solution has applied that paradigm. Such solution should be able to enhance the programmer productivity and make the application less prone to errors. To test the viability of that concept, in a previous study we developed a prototype that implements Orthogonal Persistence as an aspect. This first version of the prototype was already fully functional with all Java types including arrays. In this work the results of our new research to overcome some limitations that we have identified on the data type abstraction and transparency in the prototype are presented. One of our goals was to avoid the Java standard idiom for genericity, based on casts, type tests and subtyping. Moreover, we also find the need to introduce some dynamic data type abilities. We consider that the Reflection is the solution to those issues. To achieve that, we have extended our prototype with a new static weaver that preprocesses the application source code in order to introduce changes to the normal behavior of the Java compiler with a new generated reflective code.