934 resultados para Simplex. CPLEXR. Parallel Efficiency. Parallel Scalability. Linear Programming


Relevância:

60.00% 60.00%

Publicador:

Resumo:

The Danish Eulerian Model (DEM) is a powerful air pollution model, designed to calculate the concentrations of various dangerous species over a large geographical region (e.g. Europe). It takes into account the main physical and chemical processes between these species, the actual meteorological conditions, emissions, etc.. This is a huge computational task and requires significant resources of storage and CPU time. Parallel computing is essential for the efficient practical use of the model. Some efficient parallel versions of the model were created over the past several years. A suitable parallel version of DEM by using the Message Passing Interface library (AIPI) was implemented on two powerful supercomputers of the EPCC - Edinburgh, available via the HPC-Europa programme for transnational access to research infrastructures in EC: a Sun Fire E15K and an IBM HPCx cluster. Although the implementation is in principal, the same for both supercomputers, few modifications had to be done for successful porting of the code on the IBM HPCx cluster. Performance analysis and parallel optimization was done next. Results from bench marking experiments will be presented in this paper. Another set of experiments was carried out in order to investigate the sensitivity of the model to variation of some chemical rate constants in the chemical submodel. Certain modifications of the code were necessary to be done in accordance with this task. The obtained results will be used for further sensitivity analysis Studies by using Monte Carlo simulation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we consider hybrid (fast stochastic approximation and deterministic refinement) algorithms for Matrix Inversion (MI) and Solving Systems of Linear Equations (SLAE). Monte Carlo methods are used for the stochastic approximation, since it is known that they are very efficient in finding a quick rough approximation of the element or a row of the inverse matrix or finding a component of the solution vector. We show how the stochastic approximation of the MI can be combined with a deterministic refinement procedure to obtain MI with the required precision and further solve the SLAE using MI. We employ a splitting A = D – C of a given non-singular matrix A, where D is a diagonal dominant matrix and matrix C is a diagonal matrix. In our algorithm for solving SLAE and MI different choices of D can be considered in order to control the norm of matrix T = D –1C, of the resulting SLAE and to minimize the number of the Markov Chains required to reach given precision. Further we run the algorithms on a mini-Grid and investigate their efficiency depending on the granularity. Corresponding experimental results are presented.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we introduce a new algorithm, based on the successful work of Fathi and Alexandrov, on hybrid Monte Carlo algorithms for matrix inversion and solving systems of linear algebraic equations. This algorithm consists of two parts, approximate inversion by Monte Carlo and iterative refinement using a deterministic method. Here we present a parallel hybrid Monte Carlo algorithm, which uses Monte Carlo to generate an approximate inverse and that improves the accuracy of the inverse with an iterative refinement. The new algorithm is applied efficiently to sparse non-singular matrices. When we are solving a system of linear algebraic equations, Bx = b, the inverse matrix is used to compute the solution vector x = B(-1)b. We present results that show the efficiency of the parallel hybrid Monte Carlo algorithm in the case of sparse matrices.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this article, we introduce two new variants of the Assembly Line Worker Assignment and Balancing Problem (ALWABP) that allow parallelization of and collaboration between heterogeneous workers. These new approaches suppose an additional level of complexity in the Line Design and Assignment process, but also higher flexibility; which may be particularly useful in practical situations where the aim is to progressively integrate slow or limited workers in conventional assembly lines. We present linear models and heuristic procedures for these two new problems. Computational results show the efficiency of the proposed approaches and the efficacy of the studied layouts in different situations. (C) 2012 Elsevier B.V. All rights reserved.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Parallel phenotypic divergence in replicated adaptive radiations could either result from parallel genetic divergence in response to similar divergent selec- tion regimes or from equivalent phenotypically plastic response to the repeated occurrence of contrasting environments. In post-glacial fish, repli- cated divergence in phenotypes along the benthic-limnetic habitat axis is commonly observed. Here, we use two benthic-limnetic species pairs of whitefish from two Swiss lakes, raised in a common garden design, with reciprocal food treatments in one species pair, to experimentally measure whether feeding efficiency on benthic prey has a genetic basis or whether it underlies phenotypic plasticity (or both). To do so, we offered experimental fish mosquito larvae, partially burried in sand, and measured multiple feed- ing efficiency variables. Our results reveal both, genetic divergence as well as phenotypically plastic divergence in feeding efficiency, with the pheno- typically benthic species raised on benthic food being the most efficient forager on benthic prey. This indicates that both, divergent natural selection on genetically heritable traits and adaptive phenotypic plasticity, are likely important mechanisms driving phenotypic divergence in adaptive radiation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We show a method for parallelizing top down dynamic programs in a straightforward way by a careful choice of a lock-free shared hash table implementation and randomization of the order in which the dynamic program computes its subproblems. This generic approach is applied to dynamic programs for knapsack, shortest paths, and RNA structure alignment, as well as to a state-of-the-art solution for minimizing the máximum number of open stacks. Experimental results are provided on three different modern multicore architectures which show that this parallelization is effective and reasonably scalable. In particular, we obtain over 10 times speedup for 32 threads on the open stacks problem.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present the design and implementation of the and-parallel component of ACE. ACE is a computational model for the full Prolog language that simultaneously exploits both or-parallelism and independent and-parallelism. A high performance implementation of the ACE model has been realized and its performance reported in this paper. We discuss how some of the standard problems which appear when implementing and-parallel systems are solved in ACE. We then propose a number of optimizations aimed at reducing the overheads and the increased memory consumption which occur in such systems when using previously proposed solutions. Finally, we present results from an implementation of ACE which includes the optimizations proposed. The results show that ACE exploits and-parallelism with high efficiency and high speedups. Furthermore, they also show that the proposed optimizations, which are applicable to many other and-parallel systems, significantly decrease memory consumption and increase speedups and absolute performance both in forwards execution and during backtracking.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a technique to estimate accurate speedups for parallel logic programs with relative independence from characteristics of a given implementation or underlying parallel hardware. The proposed technique is based on gathering accurate data describing one execution at run-time, which is fed to a simulator. Alternative schedulings are then simulated and estimates computed for the corresponding speedups. A tool implementing the aforementioned techniques is presented, and its predictions are compared to the performance of real systems, showing good correlation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Incorporating the possibility of attaching attributes to variables in a logic programming system has been shown to allow the addition of general constraint solving capabilities to it. This approach is very attractive in that by adding a few primitives any logic programming system can be turned into a generic constraint logic programming system in which constraint solving can be user deñned, and at source level - an extreme example of the "glass box" approach. In this paper we propose a different and novel use for the concept of attributed variables: developing a generic parallel/concurrent (constraint) logic programming system, using the same "glass box" flavor. We argüe that a system which implements attributed variables and a few additional primitives can be easily customized at source level to implement many of the languages and execution models of parallelism and concurrency currently proposed, in both shared memory and distributed systems. We illustrate this through examples and report on an implementation of our ideas.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Although the sequential execution speed of logic programs has been greatly improved by the concepts introduced in the Warren Abstract Machine (WAM), parallel execution represents the only way to increase this speed beyond the natural limits of sequential systems. However, most proposed parallel logic programming execution models lack the performance optimizations and storage efficiency of sequential systems. This paper presents a parallel abstract machine which is an extension of the WAM and is thus capable of supporting ANDParallelism without giving up the optimizations present in sequential implementations. A suitable instruction set, which can be used as a target by a variety of logic programming languages, is also included. Special instructions are provided to support a generalized version of "Restricted AND-Parallelism" (RAP), a technique which reduces the overhead traditionally associated with the run-time management of variable binding conflicts to a series of simple run-time checks, which select one out of a series of compiled execution graphs.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Incorporating the possibility of attaching attributes to variables in a logic programming system has been shown to allow the addition of general constraint solving capabilities to it. This approach is very attractive in that by adding a few primitives any logic programming system can be turned into a generic constraint logic programming system in which constraint solving can be user defined, and at source level - an extreme example of the "glass box" approach. In this paper we propose a different and novel use for the concept of attributed variables: developing a generic parallel/concurrent (constraint) logic programming system, using the same "glass box" flavor. We argüe that a system which implements attributed variables and a few additional primitives can be easily customized at source level to implement many of the languages and execution models of parallelism and concurrency currently proposed, in both shared memory and distributed systems. We illustrate this through examples.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Debido al gran incremento de datos digitales que ha tenido lugar en los últimos años, ha surgido un nuevo paradigma de computación paralela para el procesamiento eficiente de grandes volúmenes de datos. Muchos de los sistemas basados en este paradigma, también llamados sistemas de computación intensiva de datos, siguen el modelo de programación de Google MapReduce. La principal ventaja de los sistemas MapReduce es que se basan en la idea de enviar la computación donde residen los datos, tratando de proporcionar escalabilidad y eficiencia. En escenarios libres de fallo, estos sistemas generalmente logran buenos resultados. Sin embargo, la mayoría de escenarios donde se utilizan, se caracterizan por la existencia de fallos. Por tanto, estas plataformas suelen incorporar características de tolerancia a fallos y fiabilidad. Por otro lado, es reconocido que las mejoras en confiabilidad vienen asociadas a costes adicionales en recursos. Esto es razonable y los proveedores que ofrecen este tipo de infraestructuras son conscientes de ello. No obstante, no todos los enfoques proporcionan la misma solución de compromiso entre las capacidades de tolerancia a fallo (o de manera general, las capacidades de fiabilidad) y su coste. Esta tesis ha tratado la problemática de la coexistencia entre fiabilidad y eficiencia de los recursos en los sistemas basados en el paradigma MapReduce, a través de metodologías que introducen el mínimo coste, garantizando un nivel adecuado de fiabilidad. Para lograr esto, se ha propuesto: (i) la formalización de una abstracción de detección de fallos; (ii) una solución alternativa a los puntos únicos de fallo de estas plataformas, y, finalmente, (iii) un nuevo sistema de asignación de recursos basado en retroalimentación a nivel de contenedores. Estas contribuciones genéricas han sido evaluadas tomando como referencia la arquitectura Hadoop YARN, que, hoy en día, es la plataforma de referencia en la comunidad de los sistemas de computación intensiva de datos. En la tesis se demuestra cómo todas las contribuciones de la misma superan a Hadoop YARN tanto en fiabilidad como en eficiencia de los recursos utilizados. ABSTRACT Due to the increase of huge data volumes, a new parallel computing paradigm to process big data in an efficient way has arisen. Many of these systems, called dataintensive computing systems, follow the Google MapReduce programming model. The main advantage of these systems is based on the idea of sending the computation where the data resides, trying to provide scalability and efficiency. In failure-free scenarios, these frameworks usually achieve good results. However, these ones are not realistic scenarios. Consequently, these frameworks exhibit some fault tolerance and dependability techniques as built-in features. On the other hand, dependability improvements are known to imply additional resource costs. This is reasonable and providers offering these infrastructures are aware of this. Nevertheless, not all the approaches provide the same tradeoff between fault tolerant capabilities (or more generally, reliability capabilities) and cost. In this thesis, we have addressed the coexistence between reliability and resource efficiency in MapReduce-based systems, looking for methodologies that introduce the minimal cost and guarantee an appropriate level of reliability. In order to achieve this, we have proposed: (i) a formalization of a failure detector abstraction; (ii) an alternative solution to single points of failure of these frameworks, and finally (iii) a novel feedback-based resource allocation system at the container level. Finally, our generic contributions have been instantiated for the Hadoop YARN architecture, which is the state-of-the-art framework in the data-intensive computing systems community nowadays. The thesis demonstrates how all our approaches outperform Hadoop YARN in terms of reliability and resource efficiency.