938 resultados para PDE-based parallel preconditioner
Resumo:
Recent research in multi-agent systems incorporate fault tolerance concepts. However, the research does not explore the extension and implementation of such ideas for large scale parallel computing systems. The work reported in this paper investigates a swarm array computing approach, namely ‘Intelligent Agents’. In the approach considered a task to be executed on a parallel computing system is decomposed to sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information during the event of a predicted core/processor failure and for successfully completing the task. The agents hence contribute towards fault tolerance and towards building reliable systems. The feasibility of the approach is validated by simulations on an FPGA using a multi-agent simulator and implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.
Resumo:
Application of optimization algorithm to PDE modeling groundwater remediation can greatly reduce remediation cost. However, groundwater remediation analysis requires a computational expensive simulation, therefore, effective parallel optimization could potentially greatly reduce computational expense. The optimization algorithm used in this research is Parallel Stochastic radial basis function. This is designed for global optimization of computationally expensive functions with multiple local optima and it does not require derivatives. In each iteration of the algorithm, an RBF is updated based on all the evaluated points in order to approximate expensive function. Then the new RBF surface is used to generate the next set of points, which will be distributed to multiple processors for evaluation. The criteria of selection of next function evaluation points are estimated function value and distance from all the points known. Algorithms created for serial computing are not necessarily efficient in parallel so Parallel Stochastic RBF is different algorithm from its serial ancestor. The application for two Groundwater Superfund Remediation sites, Umatilla Chemical Depot, and Former Blaine Naval Ammunition Depot. In the study, the formulation adopted treats pumping rates as decision variables in order to remove plume of contaminated groundwater. Groundwater flow and contamination transport is simulated with MODFLOW-MT3DMS. For both problems, computation takes a large amount of CPU time, especially for Blaine problem, which requires nearly fifty minutes for a simulation for a single set of decision variables. Thus, efficient algorithm and powerful computing resource are essential in both cases. The results are discussed in terms of parallel computing metrics i.e. speedup and efficiency. We find that with use of up to 24 parallel processors, the results of the parallel Stochastic RBF algorithm are excellent with speed up efficiencies close to or exceeding 100%.
Resumo:
In this paper, the use of differential evolution ( DE), a global search technique inspired by evolutionary theory, to find the parameters that are required to achieve optimum dynamic response of parallel operation of inverters with no interconnection among the controllers is proposed. Basically, in order to reach such a goal, the system is modeled in a certain way that the slopes of P-omega and Q-V curves are the parameters to be tuned. Such parameters, when properly tuned, result in system's eigenvalues located in positions that assure the system's stability and oscillation-free dynamic response with minimum settling time. This paper describes the modeling approach and provides an overview of the motivation for the optimization and a description of the DE technique. Simulation and experimental results are also presented, and they show the viability of the proposed method.
Resumo:
In this work it is proposed an optimized dynamic response of parallel operation of two single-phase inverters with no control communication. The optimization aims the tuning of the slopes of P-ω and Q-V curves so that the system is stable, damped and minimum settling time. The slopes are tuned using an algorithm based on evolutionary theory. Simulation and experimental results are presented to prove the feasibility of the proposed approach. © 2010 IEEE.
Resumo:
In the past two decades the work of a growing portion of researchers in robotics focused on a particular group of machines, belonging to the family of parallel manipulators: the cable robots. Although these robots share several theoretical elements with the better known parallel robots, they still present completely (or partly) unsolved issues. In particular, the study of their kinematic, already a difficult subject for conventional parallel manipulators, is further complicated by the non-linear nature of cables, which can exert only efforts of pure traction. The work presented in this thesis therefore focuses on the study of the kinematics of these robots and on the development of numerical techniques able to address some of the problems related to it. Most of the work is focused on the development of an interval-analysis based procedure for the solution of the direct geometric problem of a generic cable manipulator. This technique, as well as allowing for a rapid solution of the problem, also guarantees the results obtained against rounding and elimination errors and can take into account any uncertainties in the model of the problem. The developed code has been tested with the help of a small manipulator whose realization is described in this dissertation together with the auxiliary work done during its design and simulation phases.
Resumo:
Stepwise uncertainty reduction (SUR) strategies aim at constructing a sequence of points for evaluating a function f in such a way that the residual uncertainty about a quantity of interest progressively decreases to zero. Using such strategies in the framework of Gaussian process modeling has been shown to be efficient for estimating the volume of excursion of f above a fixed threshold. However, SUR strategies remain cumbersome to use in practice because of their high computational complexity, and the fact that they deliver a single point at each iteration. In this article we introduce several multipoint sampling criteria, allowing the selection of batches of points at which f can be evaluated in parallel. Such criteria are of particular interest when f is costly to evaluate and several CPUs are simultaneously available. We also manage to drastically reduce the computational cost of these strategies through the use of closed form formulas. We illustrate their performances in various numerical experiments, including a nuclear safety test case. Basic notions about kriging, auxiliary problems, complexity calculations, R code, and data are available online as supplementary materials.
Resumo:
We argüe that in order to exploit both Independent And- and Or-parallelism in Prolog programs there is advantage in recomputing some of the independent goals, as opposed to all their solutions being reused. We present an abstract model, called the Composition-Tree, for representing and-or parallelism in Prolog Programs. The Composition-tree closely mirrors sequential Prolog execution by recomputing some independent goals rather than fully re-using them. We also outline two environment representation techniques for And-Or parallel execution of full Prolog based on the Composition-tree model abstraction. We argüe that these techniques have advantages over earlier proposals for exploiting and-or parallelism in Prolog.
Resumo:
In this paper we present a novel execution model for parallel implementation of logic programs which is capable of exploiting both independent and-parallelism and or-parallelism in an efficient way. This model extends the stack copying approach, which has been successfully applied in the Muse system to implement or-parallelism, by integrating it with proven techniques used to support independent and-parallelism. We show how all solutions to non-deterministic andparallel goals are found without repetitions. This is done through recomputation as in Prolog (and in various and-parallel systems, like &-Prolog and DDAS), i.e., solutions of and-parallel goals are not shared. We propose a scheme for the efficient management of the address space in a way that is compatible with the apparently incompatible requirements of both and- and or-parallelism. We also show how the full Prolog language, with all its extra-logical features, can be supported in our and-or parallel system so that its sequential semantics is preserved. The resulting system retains the advantages of both purely or-parallel systems as well as purely and-parallel systems. The stack copying scheme together with our proposed memory management scheme can also be used to implement models that combine dependent and-parallelism and or-parallelism, such as Andorra and Prometheus.
Resumo:
We argüe that in order to exploit both Independent And- and Or-parallelism in Prolog programs there is advantage in recomputing some of the independent goals, as opposed to all their solutions being reused. We present an abstract model, called the Composition- Tree, for representing and-or parallelism in Prolog Programs. The Composition-tree closely mirrors sequential Prolog execution by recomputing some independent goals rather than fully re-using them. We also outline two environment representation techniques for And-Or parallel execution of full Prolog based on the Composition-tree model abstraction. We argüe that these techniques have advantages over earlier proposals for exploiting and-or parallelism in Prolog.
Resumo:
We discuss several issues involved in the implementation of ACE, a model capable of exploiting both And-parallelism and Or-parallelism in Prolog in a unified framework. The Orparallel model that ACE employs is based on the idea of stack-copying developed for Muse, while the model of independent And-parallelism is based on the distributed stack approach of &-Prolog. We discuss the organization of the workers, a number of sharing assumtions, techniques for work load detection, and issues relaed to which parts need to be copied when a flexible and-scheduling strategy is used.