855 results for Parallel Computation
Abstract:
In this thesis, the reduced Navier-Stokes (RNS) equations are integrated numerically. This formulation describes the flow in a three-dimensional boundary layer that also has a short characteristic length scale in the spanwise direction. The RNS equations are used to compute nonlinear, finite-amplitude “streaks”, and the results agree with those reported in the literature, which are typically obtained using direct numerical simulation (DNS) or the nonlinear parabolized stability equations (PSE). Computing the “streaks” by integrating the RNS is much cheaper than using DNS, and it avoids the stability problems that appear in the PSE formulation when the amplitude of the “streak” is no longer small. The RNS integration code is also used to compute the “streaks” that arise naturally at the leading edge of a flat plate in the absence of any perturbations in the uniform free stream. Until now, these “streaks” had been computed only in the linear limit (small amplitude); in this thesis they are computed in the fully nonlinear regime (finite amplitude). In the second part of the thesis, the RNS code is generalized to allow for a non-flat plate whose spanwise curvature varies slowly in the streamwise direction. This is achieved by applying a change of coordinates that transforms the physical domain into a rectangular one, and the RNS formulation is then integrated in the corresponding curvilinear coordinates. Finally, the generalized RNS code is used to study the boundary layer flow over a plate with grooves that vary slowly in the streamwise direction, and to simulate the flow over grooves that grow in that direction.
Abstract:
Effective static analyses have been proposed that infer functions bounding the number of resolutions or reductions a program performs. Such bounds have the advantage of being independent of the platform on which the programs are executed, and they have proved useful in a number of applications, such as granularity control in parallel execution. On the other hand, in certain distributed computation scenarios where different platforms come into play, each with different capabilities, it is more interesting to express costs in metrics that include the characteristics of the platform. In particular, it is especially interesting to be able to infer upper and lower bounds on actual execution time. With this objective in mind, we propose a method that infers upper and lower bounds on the execution times of the procedures of a program on a given execution platform. The approach combines compile-time cost bounds analysis with a one-time profiling of the platform in order to determine the values of certain constants for that platform. These constants calibrate a cost model which, from then on, is able to compute time bound functions for procedures statically and to predict the execution times of such procedures on the given platform with a significant degree of accuracy. The approach has been implemented and integrated in the CiaoPP system.
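The calibration idea described in this abstract can be illustrated with a small sketch. It is not the CiaoPP implementation: it assumes a deliberately simplified cost model with a single constant (seconds per resolution step), and the names `PlatformConstants`, `calibrate`, and `time_bounds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlatformConstants:
    """Hypothetical per-platform constants, in seconds per resolution step."""
    min_step: float
    max_step: float


def calibrate(samples):
    """One-time profiling step.

    Each sample is a pair (resolution steps, measured seconds) obtained by
    running a benchmark on the target platform.
    """
    per_step = [seconds / steps for steps, seconds in samples if steps > 0]
    return PlatformConstants(min(per_step), max(per_step))


def time_bounds(lower_steps, upper_steps, constants):
    """Turn platform-independent step bounds into time bound functions.

    lower_steps and upper_steps map an input size to a number of resolution
    steps; they stand in for the output of a static cost analysis.
    """
    def bounds(size):
        return (constants.min_step * lower_steps(size),
                constants.max_step * upper_steps(size))
    return bounds


# Assumed profiling data and assumed step bounds for some procedure.
constants = calibrate([(100, 0.001), (200, 0.003)])
bounds = time_bounds(lambda n: n + 1, lambda n: 2 * n + 1, constants)
low, high = bounds(10)
```

Once the constants have been measured, `bounds` can be evaluated for any input size without running the procedure, which is the property the abstract emphasizes.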
Abstract:
We propose an abstract syntax for Prolog that will help the manipulation of programs at compile-time, as well as the exchange of sources and information among the tools designed for this manipulation. These tools include analysers, partial evaluators, and program transformation tools. We have chosen to concentrate on the information exchange format rather than on the syntax of programs, for which we assume a simplified format. Our purpose is to provide a low-level meeting point for the tools, which will allow them to read the same programs and understand the information about them. This report describes our first design in an informal way. We expect this design to evolve and become more concrete during the project, along with the future development of the tools.
Abstract:
Abstract is not available
Abstract:
We discuss several issues involved in the implementation of ACE, a model capable of exploiting both And-parallelism and Or-parallelism in Prolog in a unified framework. The Or-parallel model that ACE employs is based on the idea of stack-copying developed for Muse, while the model of independent And-parallelism is based on the distributed stack approach of &-Prolog. We discuss the organization of the workers, a number of sharing assumptions, techniques for work load detection, and issues related to which parts need to be copied when a flexible and-scheduling strategy is used.
Abstract:
The term "Logic Programming" refers to a variety of computer languages and execution models which are based on the traditional concept of Symbolic Logic. The expressive power of these languages offers promise to be of great assistance in facing the programming challenges of present and future symbolic processing applications in Artificial Intelligence, Knowledge-based systems, and many other areas of computing. The sequential execution speed of logic programs has been greatly improved since the advent of the first interpreters. However, higher inference speeds are still required in order to meet the demands of applications such as those contemplated for next generation computer systems. The execution of logic programs in parallel is currently considered a promising strategy for attaining such inference speeds. Logic Programming in turn appears as a suitable programming paradigm for parallel architectures because of the many opportunities for parallel execution present in the implementation of logic programs. This dissertation presents an efficient parallel execution model for logic programs. The model is described from the source language level down to an "Abstract Machine" level suitable for direct implementation on existing parallel systems or for the design of special purpose parallel architectures. Few assumptions are made at the source language level and therefore the techniques developed and the general Abstract Machine design are applicable to a variety of logic (and also functional) languages. These techniques offer efficient solutions to several areas of parallel Logic Programming implementation previously considered problematic or a source of considerable overhead, such as the detection and handling of variable binding conflicts in AND-Parallelism, the specification of control and management of the execution tree, the treatment of distributed backtracking, and goal scheduling and memory management issues, etc. 
A parallel Abstract Machine design is offered, specifying data areas, operation, and a suitable instruction set. This design is based on extending to a parallel environment the techniques introduced by the Warren Abstract Machine, which have already made very fast and space efficient sequential systems a reality. Therefore, the model herein presented is capable of retaining sequential execution speed similar to that of high performance sequential systems, while extracting additional gains in speed by efficiently implementing parallel execution. These claims are supported by simulations of the Abstract Machine on sample programs.
Abstract:
The paper summarizes the results obtained by applying various implementations of the direct boundary element method (BEM) to the solution of the Laplace equation governing the potential flow problem during everyday service manoeuvres of high-speed trains. In particular, the results of train passing events at three different speed combinations are presented. Some recommendations are given for reducing calculation times, which, as is demonstrated, can be kept within reasonable limits even when using present-day office PCs. The method is thus shown to be a very valuable tool for the design engineer.
Abstract:
This article presents in an informal way some early results on the design of a series of paradigms for visualization of the parallel execution of logic programs. The results presented here refer to the visualization of or-parallelism, as in MUSE and Aurora, deterministic dependent and-parallelism, as in Andorra-I, and independent and-parallelism as in &-Prolog. A tool has been implemented for this purpose and has been interfaced with these systems. Results are presented showing the visualization of executions from these systems and the usefulness of the resulting tool is briefly discussed.
Abstract:
Knowing the size of the terms to which program variables are bound at run-time in logic programs is required in a class of applications related to program optimization such as, for example, granularity analysis and selection among different algorithms or control rules whose performance may be dependent on such size. Such size is difficult to even approximate at compile time and is thus generally computed at run-time by using (possibly predefined) predicates which traverse the terms involved. We propose a technique based on program transformation which has the potential of performing this computation much more efficiently. The technique is based on finding program procedures which are called before those in which knowledge regarding term sizes is needed and which traverse the terms whose size is to be determined, and transforming such procedures so that they compute term sizes "on the fly". We present a systematic way of determining whether a given program can be transformed in order to compute a given term size at a given program point without additional term traversal. Also, if several such transformations are possible our approach allows finding minimal transformations under certain criteria. We also discuss the advantages and applications of our technique and present some performance results.
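The transformation described in this abstract can be sketched as follows. This is an illustration only, written in Python rather than Prolog, and it does not reproduce the paper's analysis for deciding when the transformation applies. List terms are modelled as nested pairs, and `increment_all`, `increment_all_sized`, and `run_in_parallel` are hypothetical names.

```python
# A list term is either None (the empty list) or a (head, tail) pair,
# mirroring Prolog's [] and [H|T].

def from_list(items):
    """Build a nested-pair list term from a Python list."""
    term = None
    for item in reversed(items):
        term = (item, term)
    return term


def increment_all(term):
    """Original procedure: traverses the term, computes no size."""
    if term is None:
        return None
    head, tail = term
    return (head + 1, increment_all(tail))


def increment_all_sized(term):
    """Transformed procedure: same traversal, but the length of the
    term is accumulated on the way, so no second pass is needed."""
    if term is None:
        return None, 0
    head, tail = term
    new_tail, tail_size = increment_all_sized(tail)
    return (head + 1, new_tail), tail_size + 1


def run_in_parallel(size, threshold=1000):
    """Granularity control: spawn parallel work only for large inputs.
    The threshold value is an arbitrary assumption."""
    return size >= threshold


result, size = increment_all_sized(from_list([1, 2, 3]))
```

A later procedure that needs the size, for example to decide on parallel execution with `run_in_parallel(size)`, can use the value computed during the earlier traversal instead of walking the term again.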
Abstract:
Knowing the size of the terms to which program variables are bound at run-time in logic programs is required in a class of applications related to program optimization such as, for example, recursion elimination and granularity analysis. Such size is difficult to even approximate at compile time and is thus generally computed at run-time by using (possibly predefined) predicates which traverse the terms involved. We propose a technique based on program transformation which has the potential of performing this computation much more efficiently. The technique is based on finding program procedures which are called before those in which knowledge regarding term sizes is needed and which traverse the terms whose size is to be determined, and transforming such procedures so that they compute term sizes "on the fly". We present a systematic way of determining whether a given program can be transformed in order to compute a given term size at a given program point without additional term traversal. Also, if several such transformations are possible our approach allows finding minimal transformations under certain criteria. We also discuss the advantages and present some applications of our technique.
Abstract:
This paper presents an approach to the study of parallel systems using sequential tools. Independent And-parallelism in Prolog is an example of a parallel processing paradigm in the framework of logic programming, and implementations like
Abstract:
Bruynooghe described a framework for the top-down abstract interpretation of logic programs. In this framework, abstract interpretation is carried out by constructing an abstract and-or tree in a top-down fashion for a given query and program. Such an abstract interpreter requires fixpoint computation for programs which contain recursive predicates. This paper presents in detail a fixpoint algorithm that has been developed for this purpose and the motivation behind it. We start off by describing a simple-minded algorithm. After pointing out its shortcomings, we present a series of refinements to this algorithm, until we reach the final version. The aim is to give an intuitive grasp and provide justification for the relative complexity of the final algorithm. We also present an informal proof of correctness of the algorithm and some results obtained from an implementation.
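For orientation, the kind of fixpoint computation a recursive predicate requires can be sketched with plain iteration over a finite abstract domain. This is a generic textbook scheme, not the algorithm of the paper (whose refinements are not given in the abstract); the parity domain and the names `least_fixpoint` and `nat_step` are assumptions chosen for illustration.

```python
def least_fixpoint(step, bottom):
    """Apply a monotone function repeatedly, starting from the bottom
    element, until the result stops changing."""
    current = bottom
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


FLIP = {"even": "odd", "odd": "even"}


def nat_step(parities):
    """One abstract evaluation of the recursive predicate

        nat(0).
        nat(N1) :- nat(N), N1 is N + 1.

    over the parity domain: the fact contributes "even", and the
    recursive clause flips every parity already known to succeed.
    """
    return frozenset({"even"}) | frozenset(FLIP[p] for p in parities)


success_parities = least_fixpoint(nat_step, frozenset())
```

The iteration terminates because the domain is finite and `nat_step` only ever adds elements. Here it stabilizes once both parities have been found.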
Abstract:
In this work, the dimensional synthesis of a spherical Parallel Manipulator (PM) with a -1S kinematic chain is presented. The goal of the synthesis is to find a set of parameters that defines the PM with the best performance in terms of workspace capabilities, dexterity and isotropy. The PM is parametrized in terms of a reference element, and a non-directed search of these parameters is carried out. First, the inverse kinematics and instantaneous kinematics of the mechanism are presented. The latter is found using the screw theory formulation. An algorithm that explores a bounded set of parameters and determines the corresponding values of the global indexes is presented. The concepts of a novel global performance index and a compound index are introduced. Simulation results are shown and discussed. The best PMs found for each performance index evaluated are analyzed locally in terms of their workspace and local dexterity. The relationship between the performance of the PM and its parameters is discussed, and a prototype with the best performance in terms of the compound index is presented and analyzed.