13 resultados para Parallelism

em Queensland University of Technology - ePrints Archive


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Technical Report to accompany Ownership for Reasoning About Parallelism. Documents type system which captures effects and the operational semantics for the language which is presented as part of the paper.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the emergence of multi-cores into the mainstream, there is a growing need for systems to allow programmers and automated systems to reason about data dependencies and inherent parallelismin imperative object-oriented languages. In this paper we exploit the structure of object-oriented programs to abstract computational side-effects. We capture and validate these effects using a static type system. We use these as the basis of sufficient conditions for several different data and task parallelism patterns. We compliment our static type system with a lightweight runtime system to allow for parallelization in the presence of complex data flows. We have a functioning compiler and worked examples to demonstrate the practicality of our solution.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the emergence of multi-core processors into the mainstream, parallel programming is no longer the specialized domain it once was. There is a growing need for systems to allow programmers to more easily reason about data dependencies and inherent parallelism in general purpose programs. Many of these programs are written in popular imperative programming languages like Java and C]. In this thesis I present a system for reasoning about side-effects of evaluation in an abstract and composable manner that is suitable for use by both programmers and automated tools such as compilers. The goal of developing such a system is to both facilitate the automatic exploitation of the inherent parallelism present in imperative programs and to allow programmers to reason about dependencies which may be limiting the parallelism available for exploitation in their applications. Previous work on languages and type systems for parallel computing has tended to focus on providing the programmer with tools to facilitate the manual parallelization of programs; programmers must decide when and where it is safe to employ parallelism without the assistance of the compiler or other automated tools. None of the existing systems combine abstraction and composition with parallelization and correctness checking to produce a framework which helps both programmers and automated tools to reason about inherent parallelism. In this work I present a system for abstractly reasoning about side-effects and data dependencies in modern, imperative, object-oriented languages using a type and effect system based on ideas from Ownership Types. I have developed sufficient conditions for the safe, automated detection and exploitation of a number task, data and loop parallelism patterns in terms of ownership relationships. To validate my work, I have applied my ideas to the C] version 3.0 language to produce a language extension called Zal. I have implemented a compiler for the Zal language as an extension of the GPC] research compiler as a proof of concept of my system. I have used it to parallelize a number of real-world applications to demonstrate the feasibility of my proposed approach. In addition to this empirical validation, I present an argument for the correctness of the type system and language semantics I have proposed as well as sketches of proofs for the correctness of the sufficient conditions for parallelization proposed.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Streaming SIMD extension (SSE) is a special feature embedded in the Intel Pentium III and IV classes of microprocessors. It enables the execution of SIMD type operations to exploit data parallelism. This article presents improving computation performance of a railway network simulator by means of SSE. Voltage and current at various points of the supply system to an electrified railway line are crucial for design, daily operation and planning. With computer simulation, their time-variations can be attained by solving a matrix equation, whose size mainly depends upon the number of trains present in the system. A large coefficient matrix, as a result of congested railway line, inevitably leads to heavier computational demand and hence jeopardizes the simulation speed. With the special architectural features of the latest processors on PC platforms, significant speed-up in computations can be achieved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and IV classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving linear systems with SSE and discuss advantages and disadvantages of this approach based on our experimental study.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

There are many applications in aeronautics where there exist strong couplings between disciplines. One practical example is within the context of Unmanned Aerial Vehicle(UAV) automation where there exists strong coupling between operation constraints, aerodynamics, vehicle dynamics, mission and path planning. UAV path planning can be done either online or offline. The current state of path planning optimisation online UAVs with high performance computation is not at the same level as its ground-based offline optimizer's counterpart, this is mainly due to the volume, power and weight limitations on the UAV; some small UAVs do not have the computational power needed for some optimisation and path planning task. In this paper, we describe an optimisation method which can be applied to Multi-disciplinary Design Optimisation problems and UAV path planning problems. Hardware-based design optimisation techniques are used. The power and physical limitations of UAV, which may not be a problem in PC-based solutions, can be approached by utilizing a Field Programmable Gate Array (FPGA) as an algorithm accelerator. The inevitable latency produced by the iterative process of an Evolutionary Algorithm (EA) is concealed by exploiting the parallelism component within the dataflow paradigm of the EA on an FPGA architecture. Results compare software PC-based solutions and the hardware-based solutions for benchmark mathematical problems as well as a simple real world engineering problem. Results also indicate the practicality of the method which can be used for more complex single and multi objective coupled problems in aeronautical applications.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Experimental and theoretical studies have shown the importance of stochastic processes in genetic regulatory networks and cellular processes. Cellular networks and genetic circuits often involve small numbers of key proteins such as transcriptional factors and signaling proteins. In recent years stochastic models have been used successfully for studying noise in biological pathways, and stochastic modelling of biological systems has become a very important research field in computational biology. One of the challenge problems in this field is the reduction of the huge computing time in stochastic simulations. Based on the system of the mitogen-activated protein kinase cascade that is activated by epidermal growth factor, this work give a parallel implementation by using OpenMP and parallelism across the simulation. Special attention is paid to the independence of the generated random numbers in parallel computing, that is a key criterion for the success of stochastic simulations. Numerical results indicate that parallel computers can be used as an efficient tool for simulating the dynamics of large-scale genetic regulatory networks and cellular processes

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Many computationally intensive scientific applications involve repetitive floating point operations other than addition and multiplication which may present a significant performance bottleneck due to the relatively large latency or low throughput involved in executing such arithmetic primitives on commod- ity processors. A promising alternative is to execute such primitives on Field Programmable Gate Array (FPGA) hardware acting as an application-specific custom co-processor in a high performance reconfig- urable computing platform. The use of FPGAs can provide advantages such as fine-grain parallelism but issues relating to code development in a hardware description language and efficient data transfer to and from the FPGA chip can present significant application development challenges. In this paper, we discuss our practical experiences in developing a selection of floating point hardware designs to be implemented using FPGAs. Our designs include some basic mathemati cal library functions which can be implemented for user defined precisions suitable for novel applications requiring non-standard floating point represen- tation. We discuss the details of our designs along with results from performance and accuracy analysis tests.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

MapReduce frameworks such as Hadoop are well suited to handling large sets of data which can be processed separately and independently, with canonical applications in information retrieval and sales record analysis. Rapid advances in sequencing technology have ensured an explosion in the availability of genomic data, with a consequent rise in the importance of large scale comparative genomics, often involving operations and data relationships which deviate from the classical Map Reduce structure. This work examines the application of Hadoop to patterns of this nature, using as our focus a wellestablished workflow for identifying promoters - binding sites for regulatory proteins - Across multiple gene regions and organisms, coupled with the unifying step of assembling these results into a consensus sequence. Our approach demonstrates the utility of Hadoop for problems of this nature, showing how the tyranny of the "dominant decomposition" can be at least partially overcome. It also demonstrates how load balance and the granularity of parallelism can be optimized by pre-processing that splits and reorganizes input files, allowing a wide range of related problems to be brought under the same computational umbrella.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The work presented in this report is aimed to implement a cost-effective offline mission path planner for aerial inspection tasks of large linear infrastructures. Like most real-world optimisation problems, mission path planning involves a number of objectives which ideally should be minimised simultaneously. Understandably, the objectives of a practical optimisation problem are conflicting each other and the minimisation of one of them necessarily implies the impossibility to minimise the other ones. This leads to the need to find a set of optimal solutions for the problem; once such a set of available options is produced, the mission planning problem is reduced to a decision making problem for the mission specialists, who will choose the solution which best fit the requirements of the mission. The goal of this work is then to develop a Multi-Objective optimisation tool able to provide the mission specialists a set of optimal solutions for the inspection task amongst which the final trajectory will be chosen, given the environment data, the mission requirements and the definition of the objectives to minimise. All the possible optimal solutions of a Multi-Objective optimisation problem are said to form the Pareto-optimal front of the problem. For any of the Pareto-optimal solutions, it is impossible to improve one objective without worsening at least another one. Amongst a set of Pareto-optimal solutions, no solution is absolutely better than another and the final choice must be a trade-off of the objectives of the problem. Multi-Objective Evolutionary Algorithms (MOEAs) are recognised to be a convenient method for exploring the Pareto-optimal front of Multi-Objective optimization problems. Their efficiency is due to their parallelism architecture which allows to find several optimal solutions at each time