3 results for parallelization
in Universitätsbibliothek Kassel, Universität Kassel, Germany
Abstract:
For the theoretical investigation of local phenomena (adsorption at surfaces, defects or impurities within a crystal, etc.), one can assume that the effects caused by the local disturbance are limited to the neighbouring particles. With this model, known as the cluster approximation, an infinite system can be simulated by a much smaller segment of the surface (the cluster). The size of this segment varies strongly between systems. Calculations of the convergence of the bond distance and binding energy of an aluminum atom adsorbed on an Al(100) surface showed that more than 100 atoms are necessary for a sufficient description of the surface properties. However, with a fully quantum-mechanical approach, systems of this size cannot be calculated because of the demands on computer memory and processor speed. We therefore developed an embedding procedure for the simulation of surfaces and solids, in which the whole system is partitioned into several parts that are treated differently: the internal part (the cluster), which is located near the adsorbate, is calculated completely self-consistently and is embedded into an environment, while the influence of the environment on the cluster enters as an additional external potential in the relativistic Kohn-Sham equations. The procedure is based on density functional theory. This means, however, that the choice of the electronic density of the environment determines the quality of the embedding procedure. The environment density was modelled in three different ways: atomic densities; densities transferred from a large preceding calculation without embedding; and bulk densities (copied). The embedding procedure was tested on the atomic adsorption of Al on Al(100) and of Cu on Cu(100). The result was that, if the environment is chosen appropriately, only 9 embedded atoms are needed for the Al system to reproduce the results of exact slab calculations.
For the Cu system, calculations without the embedding procedure were carried out first, with the result that 60 atoms are already sufficient as a surface cluster. Using the embedding procedure, the same values were obtained with only 25 atoms. This is a substantial improvement, considering that the calculation time increases cubically with the number of atoms. With the embedding method, infinite systems can be treated by molecular methods. In addition, the program code was extended to allow molecular-dynamics simulations. Beyond the previous calculations with fixed nuclei, it is now also possible to investigate the structures of small clusters and surfaces. As a first application, we studied the adsorption of Cu on Cu(100). We calculated the relaxed positions of the atoms located close to the adsorption site and afterwards performed the fully quantum-mechanical calculation of this system. We repeated this procedure for different distances from the surface. Thus a realistic adsorption process could be examined for the first time. It should be noted that, while doing the Cu reference calculations (without embedding), we began to parallelize the entire program code. Only because of this were the investigations of the 100-atom Cu surface clusters possible. Given the good efficiency of both the parallelization and the embedding procedure developed here, we will be able to apply the combination in future. This will help in further work in these areas, where it will be possible to contribute results of fully relativistic molecular calculations, which will be especially interesting for the regime of heavy systems.
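The cubic scaling mentioned above gives a rough sense of the saving from embedding. The following is a minimal sketch, assuming the cost is exactly proportional to the cube of the atom count and ignoring prefactors and the overhead of the embedding itself; the function name `cubic_cost_ratio` is purely illustrative and is not part of the program described in the abstract.

```python
def cubic_cost_ratio(n_reference: int, n_embedded: int) -> float:
    """Estimated cost ratio of two calculations, assuming cost ~ N**3.

    This is an idealized model: prefactors and any extra cost of
    constructing the embedding potential are ignored.
    """
    if n_reference <= 0 or n_embedded <= 0:
        raise ValueError("atom counts must be positive")
    return (n_reference / n_embedded) ** 3


# Cu on Cu(100): 60 atoms without embedding versus 25 atoms with embedding.
ratio = cubic_cost_ratio(60, 25)
print(f"Estimated speed-up from embedding: {ratio:.1f}x")
```

Under this assumption the ratio is 2.4 cubed, or roughly a factor of 14 in calculation time for the Cu system.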
Abstract:
The process of developing software that takes advantage of multiple processors is commonly referred to as parallel programming. For various reasons, this process is much harder than the sequential case. For decades, parallel programming was a problem for a small niche only: engineers working on parallelizing mostly numerical applications in High Performance Computing. This has changed with the advent of multi-core processors in mainstream computer architectures. Parallel programming is now becoming a problem for a much larger group of developers. The main objective of this thesis was to find ways to make parallel programming easier for them. Three aims were identified in order to reach the objective: research the state of the art of parallel programming today, improve the education of software developers about the topic, and provide programmers with powerful abstractions to make their work easier. To reach these aims, several key steps were taken. To start with, a survey was conducted among parallel programmers to find out about the state of the art. More than 250 people participated, yielding results about the parallel programming systems and languages in use, as well as about common problems with these systems. Furthermore, a study was conducted in university classes on parallel programming. It resulted in a list of frequently made mistakes, which were analyzed and used to create a programmers' checklist to avoid them in the future. For programmers' education, an online resource called the Parawiki was set up to collect experiences and knowledge in the field of parallel programming. Another key step in this direction was the creation of the Thinking Parallel weblog, where more than 50,000 readers to date have read essays on the topic. For the third aim (powerful abstractions), it was decided to concentrate on one parallel programming system: OpenMP. Its ease of use and high level of abstraction were the most important reasons for this decision.
Two different research directions were pursued. The first one resulted in a parallel library called AthenaMP. It contains so-called generic components, derived from design patterns for parallel programming. These include functionality to enhance the locks provided by OpenMP, to perform operations on large amounts of data (data-parallel programming), and to enable the implementation of irregular algorithms using task pools. AthenaMP itself serves a triple role: its components are well documented and can be used directly in programs, developers can study the source code and learn from it, and compiler writers can use it as a testing ground for their OpenMP compilers. The second research direction was targeted at changing the OpenMP specification to make the system more powerful. The main contributions here were a proposal to enable thread cancellation and a proposal to avoid busy waiting. Both were implemented in a research compiler, shown to be useful in example applications, and proposed to the OpenMP Language Committee.
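The task-pool pattern for irregular algorithms mentioned above can be illustrated with a short sketch. This is not AthenaMP's interface and does not use OpenMP; it is a generic Python analogue built on the standard `queue` and `threading` modules, and the names `run_task_pool` and `visit_node` are hypothetical. The point it shows is that tasks may add further tasks to the pool while running, so the total amount of work is not known in advance.

```python
import queue
import threading


def run_task_pool(initial_tasks, process, num_workers=4):
    """Run tasks from a shared pool until the pool is drained.

    process(task) must return (result, new_tasks); new_tasks are added
    to the pool, which is what makes the workload irregular.
    """
    pool = queue.Queue()
    results = []
    results_lock = threading.Lock()

    for task in initial_tasks:
        pool.put(task)

    def worker():
        while True:
            task = pool.get()
            if task is None:  # sentinel: no more work, shut down
                pool.task_done()
                return
            try:
                result, new_tasks = process(task)
                with results_lock:
                    results.append(result)
                # Enqueue follow-up tasks before marking this one done,
                # so pool.join() cannot return while work is pending.
                for new_task in new_tasks:
                    pool.put(new_task)
            finally:
                pool.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    pool.join()  # wait until every enqueued task has been processed
    for _ in threads:
        pool.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


def visit_node(node):
    """Example task: a tree node is (value, children)."""
    value, children = node
    return value, children


# An unbalanced tree: the amount of work per branch is not known up front.
tree = (1, [(2, [(4, []), (5, [])]), (3, [])])
values = run_task_pool([tree], visit_node, num_workers=3)
print(sorted(values))
```

Workers block on the queue rather than spinning, and the results arrive in a nondeterministic order, which is why the example sorts them before printing.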
Abstract:
Software Defined Radio (SDR) hardware platforms use parallel architectures. Current approaches to developing applications (such as WLAN) for these platforms are complex, because developers must describe an application together with hardware specifics that are relevant to parallelism, such as mapping and scheduling. To reduce this complexity, we have developed a new programming approach for SDR applications, called Virtual Radio Engine (VRE). VRE defines a language for describing applications and a tool chain, consisting of a compiler kernel and other tools (such as a code generator), to generate executables. The thesis presents this concept and describes the language and the compiler kernel, both of which were developed by the author. The language is hardware-independent, i.e., developers describe only tasks and the dependencies between them. The compiler kernel performs automatic parallelization, i.e., it transforms a hardware-independent program into a hardware-specific program by resolving the hardware specifics, in particular mapping, scheduling and synchronization. Thus, VRE simplifies programming, as developers do not have to resolve hardware specifics manually.
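To make the terms mapping and scheduling concrete, here is a minimal sketch of a greedy list scheduler that takes a hardware-independent description (tasks with durations and dependencies) and assigns each task a processor and a start time. It is an illustration under simplifying assumptions (identical processors, no communication cost) and is not the algorithm used by the VRE compiler kernel; the function name `list_schedule` and the task names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def list_schedule(durations, deps, num_processors):
    """Greedy mapping and scheduling of a task graph.

    durations: {task: run time}
    deps:      {task: iterable of tasks that must finish first}
    Returns {task: (processor, start, end)}.
    """
    if num_processors < 1:
        raise ValueError("need at least one processor")
    # Make sure every task appears in the graph, even without dependencies.
    graph = {task: tuple(deps.get(task, ())) for task in durations}
    order = TopologicalSorter(graph).static_order()

    processor_free_at = [0] * num_processors
    schedule = {}
    for task in order:
        # A task may start only after all of its predecessors have ended.
        ready = max((schedule[d][2] for d in graph[task]), default=0)
        # Mapping: choose the processor that allows the earliest start.
        proc = min(
            range(num_processors),
            key=lambda p: max(processor_free_at[p], ready),
        )
        # Scheduling: fix the start and end time on that processor.
        start = max(processor_free_at[proc], ready)
        end = start + durations[task]
        schedule[task] = (proc, start, end)
        processor_free_at[proc] = end
    return schedule


# Hardware-independent description: tasks and dependencies only.
durations = {"a": 2, "b": 3, "c": 2, "d": 1}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}

for task, (proc, start, end) in sorted(list_schedule(durations, deps, 2).items()):
    print(f"task {task}: processor {proc}, start {start}, end {end}")
```

The same description can be scheduled for a different processor count by changing only the last argument, which is the kind of separation between application description and hardware specifics that the abstract describes.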