934 resultados para Simplex. CPLEXR. Parallel Efficiency. Parallel Scalability. Linear Programming


Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we describe an hybrid algorithm for an even number of processors based on an algorithm for two processors and the Overlapping Partition Method for tridiagonal systems. Moreover, we compare this hybrid method with the Partition Wang’s method in a BSP computer. Finally, we compare the theoretical computation cost of both methods for a Cray T3D computer, using the cost model that BSP model provides.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Thesis (M.S.)--University of Illinois at Urbana-Champaign, 1966.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Includes bibliographical references.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Image segmentation is one of the most computationally intensive operations in image processing and computer vision. This is because a large volume of data is involved and many different features have to be extracted from the image data. This thesis is concerned with the investigation of practical issues related to the implementation of several classes of image segmentation algorithms on parallel architectures. The Transputer is used as the basic building block of hardware architectures and Occam is used as the programming language. The segmentation methods chosen for implementation are convolution, for edge-based segmentation; the Split and Merge algorithm for segmenting non-textured regions; and the Granlund method for segmentation of textured images. Three different convolution methods have been implemented. The direct method of convolution, carried out in the spatial domain, uses the array architecture. The other two methods, based on convolution in the frequency domain, require the use of the two-dimensional Fourier transform. Parallel implementations of two different Fast Fourier Transform algorithms have been developed, incorporating original solutions. For the Row-Column method the array architecture has been adopted, and for the Vector-Radix method, the pyramid architecture. The texture segmentation algorithm, for which a system-level design is given, demonstrates a further application of the Vector-Radix Fourier transform. A novel concurrent version of the quad-tree based Split and Merge algorithm has been implemented on the pyramid architecture. The performance of the developed parallel implementations is analysed. Many of the obtained speed-up and efficiency measures show values close to their respective theoretical maxima. Where appropriate comparisons are drawn between different implementations. The thesis concludes with comments on general issues related to the use of the Transputer system as a development tool for image processing applications; and on the issues related to the engineering of concurrent image processing applications.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

An iterative Monte Carlo algorithm for evaluating linear functionals of the solution of integral equations with polynomial non-linearity is proposed and studied. The method uses a simulation of branching stochastic processes. It is proved that the mathematical expectation of the introduced random variable is equal to a linear functional of the solution. The algorithm uses the so-called almost optimal density function. Numerical examples are considered. Parallel implementation of the algorithm is also realized using the package ATHAPASCAN as an environment for parallel realization.The computational results demonstrate high parallel efficiency of the presented algorithm and give a good solution when almost optimal density function is used as a transition density.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Femtosecond laser microfabrication has emerged over the last decade as a 3D flexible technology in photonics. Numerical simulations provide an important insight into spatial and temporal beam and pulse shaping during the course of extremely intricate nonlinear propagation (see e.g. [1,2]). Electromagnetics of such propagation is typically described in the form of the generalized Non-Linear Schrdinger Equation (NLSE) coupled with Drude model for plasma [3]. In this paper we consider a multi-threaded parallel numerical solution for a specific model which describes femtosecond laser pulse propagation in transparent media [4, 5]. However our approach can be extended to similar models. The numerical code is implemented in NVIDIA Graphics Processing Unit (GPU) which provides an effitient hardware platform for multi-threded computing. We compare the performance of the described below parallel code implementated for GPU using CUDA programming interface [3] with a serial CPU version used in our previous papers [4,5]. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We describe a parallel multi-threaded approach for high performance modelling of wide class of phenomena in ultrafast nonlinear optics. Specific implementation has been performed using the highly parallel capabilities of a programmable graphics processor. © 2011 SPIE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The main focus of this research is to design and develop a high performance linear actuator based on a four bar mechanism. The present work includes the detailed analysis (kinematics and dynamics), design, implementation and experimental validation of the newly designed actuator. High performance is characterized by the acceleration of the actuator end effector. The principle of the newly designed actuator is to network the four bar rhombus configuration (where some bars are extended to form an X shape) to attain high acceleration. Firstly, a detailed kinematic analysis of the actuator is presented and kinematic performance is evaluated through MATLAB simulations. A dynamic equation of the actuator is achieved by using the Lagrangian dynamic formulation. A SIMULINK control model of the actuator is developed using the dynamic equation. In addition, Bond Graph methodology is presented for the dynamic simulation. The Bond Graph model comprises individual component modeling of the actuator along with control. Required torque was simulated using the Bond Graph model. Results indicate that, high acceleration (around 20g) can be achieved with modest (3 N-m or less) torque input. A practical prototype of the actuator is designed using SOLIDWORKS and then produced to verify the proof of concept. The design goal was to achieve the peak acceleration of more than 10g at the middle point of the travel length, when the end effector travels the stroke length (around 1 m). The actuator is primarily designed to operate in standalone condition and later to use it in the 3RPR parallel robot. A DC motor is used to operate the actuator. A quadrature encoder is attached with the DC motor to control the end effector. The associated control scheme of the actuator is analyzed and integrated with the physical prototype. From standalone experimentation of the actuator, around 17g acceleration was achieved by the end effector (stroke length was 0.2m to 0.78m). Results indicate that the developed dynamic model results are in good agreement. Finally, a Design of Experiment (DOE) based statistical approach is also introduced to identify the parametric combination that yields the greatest performance. Data are collected by using the Bond Graph model. This approach is helpful in designing the actuator without much complexity.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A scenario-based two-stage stochastic programming model for gas production network planning under uncertainty is usually a large-scale nonconvex mixed-integer nonlinear programme (MINLP), which can be efficiently solved to global optimality with nonconvex generalized Benders decomposition (NGBD). This paper is concerned with the parallelization of NGBD to exploit multiple available computing resources. Three parallelization strategies are proposed, namely, naive scenario parallelization, adaptive scenario parallelization, and adaptive scenario and bounding parallelization. Case study of two industrial natural gas production network planning problems shows that, while the NGBD without parallelization is already faster than a state-of-the-art global optimization solver by an order of magnitude, the parallelization can improve the efficiency by several times on computers with multicore processors. The adaptive scenario and bounding parallelization achieves the best overall performance among the three proposed parallelization strategies.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A new parallel approach for solving a pentadiagonal linear system is presented. The parallel partition method for this system and the TW parallel partition method on a chain of P processors are introduced and discussed. The result of this algorithm is a reduced pentadiagonal linear system of order P \Gamma 2 compared with a system of order 2P \Gamma 2 for the parallel partition method. More importantly the new method involves only half the number of communications startups than the parallel partition method (and other standard parallel methods) and hence is a far more efficient parallel algorithm.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The difficulties encountered in implementing large scale CM codes on multiprocessor systems are now fairly well understood. Despite the claims of shared memory architecture manufacturers to provide effective parallelizing compilers, these have not proved to be adequate for large or complex programs. Significant programmer effort is usually required to achieve reasonable parallel efficiencies on significant numbers of processors. The paradigm of Single Program Multi Data (SPMD) domain decomposition with message passing, where each processor runs the same code on a subdomain of the problem, communicating through exchange of messages, has for some time been demonstrated to provide the required level of efficiency, scalability, and portability across both shared and distributed memory systems, without the need to re-author the code into a new language or even to support differing message passing implementations. Extension of the methods into three dimensions has been enabled through the engineering of PHYSICA, a framework for supporting 3D, unstructured mesh and continuum mechanics modeling. In PHYSICA, six inspectors are used. Part of the challenge for automation of parallelization is being able to prove the equivalence of inspectors so that they can be merged into as few as possible.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Solving linear systems is an important problem for scientific computing. Exploiting parallelism is essential for solving complex systems, and this traditionally involves writing parallel algorithms on top of a library such as MPI. The SPIKE family of algorithms is one well-known example of a parallel solver for linear systems. The Hierarchically Tiled Array data type extends traditional data-parallel array operations with explicit tiling and allows programmers to directly manipulate tiles. The tiles of the HTA data type map naturally to the block nature of many numeric computations, including the SPIKE family of algorithms. The higher level of abstraction of the HTA enables the same program to be portable across different platforms. Current implementations target both shared-memory and distributed-memory models. In this thesis we present a proof-of-concept for portable linear solvers. We implement two algorithms from the SPIKE family using the HTA library. We show that our implementations of SPIKE exploit the abstractions provided by the HTA to produce a compact, clean code that can run on both shared-memory and distributed-memory models without modification. We discuss how we map the algorithms to HTA programs as well as examine their performance. We compare the performance of our HTA codes to comparable codes written in MPI as well as current state-of-the-art linear algebra routines.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A poster of this paper will be presented at the 25th International Conference on Parallel Architecture and Compilation Technology (PACT ’16), September 11-15, 2016, Haifa, Israel.