131 resultados para Parallel computation
em Queensland University of Technology - ePrints Archive
Resumo:
Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and P4 classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.
Resumo:
This paper compares the performances of two different optimisation techniques for solving inverse problems; the first one deals with the Hierarchical Asynchronous Parallel Evolutionary Algorithms software (HAPEA) and the second is implemented with a game strategy named Nash-EA. The HAPEA software is based on a hierarchical topology and asynchronous parallel computation. The Nash-EA methodology is introduced as a distributed virtual game and consists of splitting the wing design variables - aerofoil sections - supervised by players optimising their own strategy. The HAPEA and Nash-EA software methodologies are applied to a single objective aerodynamic ONERA M6 wing reconstruction. Numerical results from the two approaches are compared in terms of the quality of model and computational expense and demonstrate the superiority of the distributed Nash-EA methodology in a parallel environment for a similar design quality.
Resumo:
The Node-based Local Mesh Generation (NLMG) algorithm, which is free of mesh inconsistency, is one of core algorithms in the Node-based Local Finite Element Method (NLFEM) to achieve the seamless link between mesh generation and stiffness matrix calculation, and the seamless link helps to improve the parallel efficiency of FEM. Furthermore, the key to ensure the efficiency and reliability of NLMG is to determine the candidate satellite-node set of a central node quickly and accurately. This paper develops a Fast Local Search Method based on Uniform Bucket (FLSMUB) and a Fast Local Search Method based on Multilayer Bucket (FLSMMB), and applies them successfully to the decisive problems, i.e. presenting the candidate satellite-node set of any central node in NLMG algorithm. Using FLSMUB or FLSMMB, the NLMG algorithm becomes a practical tool to reduce the parallel computation cost of FEM. Parallel numerical experiments validate that either FLSMUB or FLSMMB is fast, reliable and efficient for their suitable problems and that they are especially effective for computing the large-scale parallel problems.
Resumo:
The load–frequency control (LFC) problem has been one of the major subjects in a power system. In practice, LFC systems use proportional–integral (PI) controllers. However since these controllers are designed using a linear model, the non-linearities of the system are not accounted for and they are incapable of gaining good dynamical performance for a wide range of operating conditions in a multi-area power system. A strategy for solving this problem because of the distributed nature of a multi-area power system is presented by using a multi-agent reinforcement learning (MARL) approach. It consists of two agents in each power area; the estimator agent provides the area control error (ACE) signal based on the frequency bias estimation and the controller agent uses reinforcement learning to control the power system in which genetic algorithm optimisation is used to tune its parameters. This method does not depend on any knowledge of the system and it admits considerable flexibility in defining the control objective. Also, by finding the ACE signal based on the frequency bias estimation the LFC performance is improved and by using the MARL parallel, computation is realised, leading to a high degree of scalability. Here, to illustrate the accuracy of the proposed approach, a three-area power system example is given with two scenarios.
Resumo:
Considerate amount of research has proposed optimization-based approaches employing various vibration parameters for structural damage diagnosis. The damage detection by these methods is in fact a result of updating the analytical structural model in line with the current physical model. The feasibility of these approaches has been proven. But most of the verification has been done on simple structures, such as beams or plates. In the application on a complex structure, like steel truss bridges, a traditional optimization process will cost massive computational resources and lengthy convergence. This study presents a multi-layer genetic algorithm (ML-GA) to overcome the problem. Unlike the tedious convergence process in a conventional damage optimization process, in each layer, the proposed algorithm divides the GA’s population into groups with a less number of damage candidates; then, the converged population in each group evolves as an initial population of the next layer, where the groups merge to larger groups. In a damage detection process featuring ML-GA, as parallel computation can be implemented, the optimization performance and computational efficiency can be enhanced. In order to assess the proposed algorithm, the modal strain energy correlation (MSEC) has been considered as the objective function. Several damage scenarios of a complex steel truss bridge’s finite element model have been employed to evaluate the effectiveness and performance of ML-GA, against a conventional GA. In both single- and multiple damage scenarios, the analytical and experimental study shows that the MSEC index has achieved excellent damage indication and efficiency using the proposed ML-GA, whereas the conventional GA only converges at a local solution.
Resumo:
The Streaming SIMD extension (SSE) is a special feature embedded in the Intel Pentium III and IV classes of microprocessors. It enables the execution of SIMD type operations to exploit data parallelism. This article presents improving computation performance of a railway network simulator by means of SSE. Voltage and current at various points of the supply system to an electrified railway line are crucial for design, daily operation and planning. With computer simulation, their time-variations can be attained by solving a matrix equation, whose size mainly depends upon the number of trains present in the system. A large coefficient matrix, as a result of congested railway line, inevitably leads to heavier computational demand and hence jeopardizes the simulation speed. With the special architectural features of the latest processors on PC platforms, significant speed-up in computations can be achieved.
Resumo:
The Streaming SIMD extension (SSE) is a special feature that is available in the Intel Pentium III and P4 classes of microprocessors. As its name implies, SSE enables the execution of SIMD (Single Instruction Multiple Data) operations upon 32-bit floating-point data therefore, performance of floating-point algorithms can be improved. In electrified railway system simulation, the computation involves the solving of a huge set of simultaneous linear equations, which represent the electrical characteristic of the railway network at a particular time-step and a fast solution for the equations is desirable in order to simulate the system in real-time. In this paper, we present how SSE is being applied to the railway network simulation.
Resumo:
A composite SaaS (Software as a Service) is a software that is comprised of several software components and data components. The composite SaaS placement problem is to determine where each of the components should be deployed in a cloud computing environment such that the performance of the composite SaaS is optimal. From the computational point of view, the composite SaaS placement problem is a large-scale combinatorial optimization problem. Thus, an Iterative Cooperative Co-evolutionary Genetic Algorithm (ICCGA) was proposed. The ICCGA can find reasonable quality of solutions. However, its computation time is noticeably slow. Aiming at improving the computation time, we propose an unsynchronized Parallel Cooperative Co-evolutionary Genetic Algorithm (PCCGA) in this paper. Experimental results have shown that the PCCGA not only has quicker computation time, but also generates better quality of solutions than the ICCGA.
Resumo:
As computational models in fields such as medicine and engineering get more refined, resource requirements are increased. In a first instance, these needs have been satisfied using parallel computing and HPC clusters. However, such systems are often costly and lack flexibility. HPC users are therefore tempted to move to elastic HPC using cloud services. One difficulty in making this transition is that HPC and cloud systems are different, and performance may vary. The purpose of this study is to evaluate cloud services as a means to minimise both cost and computation time for large-scale simulations, and to identify which system properties have the most significant impact on performance. Our simulation results show that, while the performance of Virtual CPU (VCPU) is satisfactory, network throughput may lead to difficulties.
Resumo:
Tridiagonal diagonally dominant linear systems arise in many scientific and engineering applications. The standard Thomas algorithm for solving such systems is inherently serial forming a bottleneck in computation. Algorithms such as cyclic reduction and SPIKE reduce a single large tridiagonal system into multiple small independent systems which can be solved in parallel. We have developed portable cyclic reduction and SPIKE algorithm OpenCL implementations with the intent to target a range of co-processors in a heterogeneous computing environment including Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs) and other multi-core processors. In this paper, we evaluate these designs in the context of solver performance, resource efficiency and numerical accuracy.
Resumo:
Investigated the psychometric properties of the original and alternate sets of the Trail Making Test (TMT) and the Controlled Oral Word Association Test (COWAT; A. L. Benton and D. Hamsher, 1978) in 50 orthopedic and 15 closed head injured (1 yr after trauma) patients (aged 15–59 yrs). Although the alternate forms of both measures proved to be stable and consistent with each other in both groups, only the parallel sets of TMT reliably discriminated the clinical group from controls. Practice effects in the head injured were significant only for Trail B of TMT. Factor analysis of the control group's results identified Verbal Knowledge as a major contributor to performance on COWAT, whereas TMT was more dependent on Rapid Visual Search and Visuomotor Sequencing.