985 resultados para parallel computation
Resumo:
The establishment of conductive graphene-molecule-graphene junction is investigated through first-principles electronic structure calculations and quantum transport calculations. The junction consists of a conjugated molecule connecting two parallel graphene sheets. The effects of molecular electronic states, structure relaxation, and molecule-graphene contact on the conductance of the junction are explored. A conductance as large as 0.38 conductance quantum is found achievable with an appropriately oriented dithiophene bridge. This work elucidates the designing principles of promising nanoelectronic devices based on conductive graphene-molecule-graphene junctions.
Resumo:
We obtain an upper bound on the time available for quantum computation for a given quantum computer and decohering environment with quantum error correction implemented. First, we derive an explicit quantum evolution operator for the logical qubits and show that it has the same form as that for the physical qubits but with a reduced coupling strength to the environment. Using this evolution operator, we find the trace distance between the real and ideal states of the logical qubits in two cases. For a super-Ohmic bath, the trace distance saturates, while for Ohmic or sub-Ohmic baths, there is a finite time before the trace distance exceeds a value set by the user. © 2010 The American Physical Society.
Resumo:
Alewife, Alosa pseudoharengus, populations occur in two discrete life-history variants, an anadromous form and a landlocked (freshwater resident) form. Landlocked populations display a consistent pattern of life-history divergence from anadromous populations, including earlier age at maturity, smaller adult body size, and reduced fecundity. In Connecticut (USA), dams constructed on coastal streams separate anadromous spawning runs from lake-resident landlocked populations. Here, we used sequence data from the mtDNA control region and allele frequency data from five microsatellite loci to ask whether coastal Connecticut landlocked alewife populations are independently evolved from anadromous populations or whether they share a common freshwater ancestor. We then used microsatellite data to estimate the timing of the divergence between anadromous and landlocked populations. Finally, we examined anadromous and landlocked populations for divergence in foraging morphology and used divergence time estimates to calculate the rate of evolution for foraging traits. Our results indicate that landlocked populations have evolved multiple times independently. Tests of population divergence and estimates of gene flow show that landlocked populations are genetically isolated, whereas anadromous populations exchange genes. These results support a 'phylogenetic raceme' model of landlocked alewife divergence, with anadromous populations forming an ancestral core from which landlocked populations independently diverged. Divergence time estimates suggest that landlocked populations diverged from a common anadromous ancestor no longer than 5000 years ago and perhaps as recently as 300 years ago, depending on the microsatellite mutation rate assumed. Examination of foraging traits reveals landlocked populations to have significantly narrower gapes and smaller gill raker spacings than anadromous populations, suggesting that they are adapted to foraging on smaller prey items. Estimates of evolutionary rates (in haldanes) indicate rapid evolution of foraging traits, possibly in response to changes in available resources.
Resumo:
An abstract of this work will be presented at the Compiler, Architecture and Tools Conference (CATC), Intel Development Center, Haifa, Israel November 23, 2015.
Resumo:
The availability of a very accurate dependence graph for a scalar code is the basis for the automatic generation of an efficient parallel implementation. The strategy for this task which is encapsulated in a comprehensive data partitioning code generation algorithm is described. This algorithm involves the data partition, calculation of assignment ranges for partitioned arrays, addition of a comprehensive set of execution control masks, altering loop limits, addition and optimisation of communications for all data. In this context, the development and implementation of strategies to merge communications wherever possible has proved an important feature in producing efficient parallel implementations for numerical mesh based codes. The code generation strategies described here are embedded within the Computer Aided Parallelisation tools (CAPTools) software as a key part of a toolkit for automating as much as possible of the parallelisation process for mesh based numerical codes. The algorithms used enables parallelisation of real computational mechanics codes with only minor user interaction and without any prior manual customisation of the serial code to suit the parallelisation tool.
Resumo:
User supplied knowledge and interaction is a vital component of a toolkit for producing high quality parallel implementations of scalar FORTRAN numerical code. In this paper we consider the necessary components that such a parallelisation toolkit should possess to provide an effective environment to identify, extract and embed user relevant user knowledge. We also examine to what extent these facilities are available in leading parallelisation tools; in particular we discuss how these issues have been addressed in the development of the user interface of the Computer Aided Parallelisation Tools (CAPTools). The CAPTools environment has been designed to enable user exploration, interaction and insertion of user knowledge to facilitate the automatic generation of very efficient parallel code. A key issue in the user's interaction is control of the volume of information so that the user is focused on only that which is needed. User control over the level and extent of information revealed at any phase is supplied using a wide variety of filters. Another issue is the way in which information is communicated. Dependence analysis and its resulting graphs involve a lot of sophisticated rather abstract concepts unlikely to be familiar to most users of parallelising tools. As such, considerable effort has been made to communicate with the user in terms that they will understand. These features, amongst others, and their use in the parallelisation process are described and their effectiveness discussed.
Resumo:
A parallel method for the dynamic partitioning of unstructured meshes is described. The method introduces a new iterative optimization technique known as relative gain optimization which both balances the workload and attempts to minimize the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of an equivalent or higher quality to static partitioners (which do not reuse the existing partition) and much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.
Resumo:
For the numerical solution of the linearized Euler equations, an optimized computational scheme is considered. It is based on fully staggered (in space and time) regular meshes and on a simple mirroring procedure at the stepwise solid walls. There is no need to define ghost points into the solid ohjects that reflect the sound waves. Test results demonstrate the accuracy of the method that may be used for aeroacoustic problems with complex geometries.
Resumo:
Parallel computing is now widely used in numerical simulation, particularly for application codes based on finite difference and finite element methods. A popular and successful technique employed to parallelize such codes onto large distributed memory systems is to partition the mesh into sub-domains that are then allocated to processors. The code then executes in parallel, using the SPMD methodology, with message passing for inter-processor interactions. In order to improve the parallel efficiency of an imbalanced structured mesh CFD code, a new dynamic load balancing (DLB) strategy has been developed in which the processor partition range limits of just one of the partitioned dimensions uses non-coincidental limits, as opposed to coincidental limits. The ‘local’ partition limit change allows greater flexibility in obtaining a balanced load distribution, as the workload increase, or decrease, on a processor is no longer restricted by the ‘global’ (coincidental) limit change. The automatic implementation of this generic DLB strategy within an existing parallel code is presented in this chapter, along with some preliminary results.
Resumo:
The demands of the process of engineering design, particularly for structural integrity, have exploited computational modelling techniques and software tools for decades. Frequently, the shape of structural components or assemblies is determined to optimise the flow distribution or heat transfer characteristics, and to ensure that the structural performance in service is adequate. From the perspective of computational modelling these activities are typically separated into: • fluid flow and the associated heat transfer analysis (possibly with chemical reactions), based upon Computational Fluid Dynamics (CFD) technology • structural analysis again possibly with heat transfer, based upon finite element analysis (FEA) techniques.
Resumo:
The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier–Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed including implicit residual smoothing and a multigrid full approximation storage scheme (FAS). Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable the execution in parallel using a single programme multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial scale aeronautical simulations. Copyright © 2000 John Wiley & Sons, Ltd.