855 results for Embarrassingly Parallel


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a dynamic distributed load balancing algorithm for parallel, adaptive finite element simulations using preconditioned conjugate gradient solvers based on domain-decomposition. The load balancer is designed to maintain good partition aspect ratios. It calculates a balancing flow using different versions of diffusion and a variant of breadth first search. Elements to be migrated are chosen according to a cost function aiming at the optimization of subdomain shapes. We show how to use information from the second step to guide the first. Experimental results using Bramble's preconditioner and comparisons to existing state-of-the-art balancers show the benefits of the construction.
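The balancing-flow step mentioned in this abstract can be illustrated with a minimal sketch of plain first-order diffusion on a processor graph. This is not the authors' algorithm: the function name `diffusion_flow`, the step size `alpha`, and the fixed iteration count are assumptions made for illustration, and the abstract's shape-aware element selection is not modelled.

```python
def diffusion_flow(loads, edges, alpha=0.25, steps=200):
    """Hypothetical first-order diffusion scheme on a processor graph.

    loads: per-processor workloads (list of numbers)
    edges: list of (i, j) pairs of neighbouring processors
    alpha: diffusion step size (assumed; should stay below 1 / max degree)
    Returns (balanced loads, net flow accumulated on each edge).
    """
    loads = [float(x) for x in loads]
    flow = {e: 0.0 for e in edges}
    for _ in range(steps):
        # Compute every transfer from the loads at the start of the sweep,
        # so the update does not depend on edge ordering.
        delta = {(i, j): alpha * (loads[i] - loads[j]) for (i, j) in edges}
        for (i, j), d in delta.items():
            loads[i] -= d          # processor i sends d units to j
            loads[j] += d          # (a negative d means j sends to i)
            flow[(i, j)] += d      # accumulate the net balancing flow
    return loads, flow
```

For a chain of four processors with all work initially on the first one, `diffusion_flow([8, 0, 0, 0], [(0, 1), (1, 2), (2, 3)])` converges to a load of 2 on each processor, and the accumulated edge flows say how much work has to cross each link.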

Relevance:

20.00% 20.00%

Publisher:

Abstract:

As the complexity of parallel applications increases, the performance limitations resulting from computational load imbalance become dominant. Mapping the problem space to the processors in a parallel machine in a manner that balances the workload of each processor will typically reduce the run-time. In many cases the computation time required for a given calculation cannot be predetermined even at run-time, and so static partitioning of the problem returns poor performance. For problems in which the computational load across the discretisation is dynamic and inhomogeneous, for example multi-physics problems involving fluid and solid mechanics with phase changes, the workload for a static subdomain will change over the course of a computation and cannot be estimated beforehand. For such applications the mapping of loads to processors is required to change dynamically, at run-time, in order to maintain reasonable efficiency. The issues of dynamic load balancing are examined in the context of PHYSICA, a three-dimensional unstructured mesh multi-physics continuum mechanics computational modelling code.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A method is outlined for optimising graph partitions which arise in mapping unstructured mesh calculations to parallel computers. The method employs a relative gain iterative technique to both evenly balance the workload and minimise the number and volume of interprocessor communications. A parallel graph reduction technique is also briefly described and can be used to give a global perspective to the optimisation. The algorithms work efficiently in parallel as well as sequentially and when combined with a fast direct partitioning technique (such as the Greedy algorithm) to give an initial partition, the resulting two-stage process proves itself to be both a powerful and flexible solution to the static graph-partitioning problem. Experiments indicate that the resulting parallel code can provide high quality partitions, independent of the initial partition, within a few seconds. The algorithms can also be used for dynamic load-balancing, reusing existing partitions and in this case the procedures are much faster than static techniques, provide partitions of similar or higher quality and, in comparison, involve the migration of a fraction of the data.
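The two objectives named in this abstract, balanced workload and low interprocessor communication, can be measured for any candidate partition. The sketch below is only an illustration: `partition_quality` is a hypothetical helper, the edge cut is used as a simple stand-in for communication cost, and the relative gain technique itself is not reproduced.

```python
from collections import Counter


def partition_quality(edges, part, weights=None):
    """Return (edge_cut, imbalance) for a graph partition.

    edges:   iterable of (u, v) vertex pairs
    part:    dict mapping each vertex to its subdomain id
    weights: optional dict mapping each vertex to its workload (default 1)
    imbalance is max subdomain load / mean subdomain load (1.0 is perfect).
    """
    # An edge is cut when its endpoints live in different subdomains.
    cut = sum(1 for u, v in edges if part[u] != part[v])

    # Sum the workload assigned to each subdomain.
    load = Counter()
    for v, p in part.items():
        load[p] += 1 if weights is None else weights[v]

    mean = sum(load.values()) / len(load)
    return cut, max(load.values()) / mean
```

On a four-vertex chain, splitting it in the middle gives `(1, 1.0)`, while alternating the vertices between the two subdomains gives `(3, 1.0)`: both are perfectly balanced, but the second needs three times as many cut edges.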

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A parallel method for the dynamic partitioning of unstructured meshes is described. The method introduces a new iterative optimisation technique known as relative gain optimisation which both balances the workload and attempts to minimise the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of equivalent or higher quality than those of static partitioners (which do not reuse the existing partition), and does so much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The central product of the DRAMA (Dynamic Re-Allocation of Meshes for parallel Finite Element Applications) project is a library comprising a variety of tools for dynamic re-partitioning of unstructured Finite Element (FE) applications. The input to the DRAMA library is the computational mesh, and corresponding costs, partitioned into sub-domains. The core library functions then perform a parallel computation of a mesh re-allocation that will re-balance the costs based on the DRAMA cost model. We discuss the basic features of this cost model, which allows a general approach to load identification, modelling and imbalance minimisation. Results from crash simulations are presented which show the necessity for multi-phase/multi-constraint partitioning components.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

“Parallel Ruptures: Jews of Bessarabia and Transnistria between Romanian Nationalism and Soviet Communism, 1918-1940,” explores the political and social debates that took place in Jewish communities in Romanian-held Bessarabia and the Moldovan Autonomous Soviet Socialist Republic during the interwar era. Both had been part of the Russian Pale of Settlement until its dissolution in 1917; they were then divided by the Romanian Army’s occupation of Bessarabia in 1918 with the establishment of a well-guarded border along the Dniester River between two newly-formed states, Greater Romania and the Soviet Union. At its core, the project focuses in comparative context on the traumatic and multi-faceted confrontation with these two modernizing states: exclusion, discrimination and growing violence in Bessarabia; destruction of religious tradition, agricultural resettlement, and socialist re-education and assimilation in Soviet Transnistria. It also examines the similarities in both states’ striving to create model subjects usable by the homeland, as well as commonalities within Jewish responses on both sides of the border. Contacts between Jews on either side of the border remained significant after 1918 despite the efforts of both states to curb them, thereby necessitating a transnational view in order to examine Jewish political and social life in borderland regions. The desire among Jewish secular leaders to mold their co-religionists into modern Jews reached across state borders and ideological divides and sought to manipulate respective governments to establish these goals, however unsuccessful in the final analysis.
Finally, strained relations between Jews in peripheral borderlands and those at the national/imperial cores, Moscow and Bucharest, shed light on the complex circumstances surrounding the inclusion versus exclusion debates at the heart of all interwar European states and the complicated negotiations that took place within all minority communities that responded to state policies.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Scientific applications rely heavily on floating point data types. Floating point operations are complex and require complicated hardware that is both area and power intensive. The emergence of massively parallel architectures like Rigel creates new challenges and poses new questions with respect to floating point support. The massively parallel aspect of Rigel places great emphasis on area efficient, low power designs. At the same time, Rigel is a general purpose accelerator and must provide high performance for a wide class of applications. This thesis presents an analysis of various floating point unit (FPU) components with respect to Rigel, and attempts to present a candidate design of an FPU that balances performance, area, and power and is suitable for massively parallel architectures like Rigel.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Small-colony variants (SCVs) are commonly observed in evolution experiments and clinical isolates, being associated with antibiotic resistance and persistent infections. We recently observed the repeated emergence of Escherichia coli SCVs during adaptation to the interaction with macrophages. To identify the genetic targets underlying the emergence of this clinically relevant morphotype, we performed whole-genome sequencing of independently evolved SCV clones. We uncovered novel mutational targets, not previously associated with SCVs (e.g. cydA, pepP) and observed widespread functional parallelism. All SCV clones had mutations in genes related to the electron-transport chain. As SCVs emerged during adaptation to macrophages, and often show increased antibiotic resistance, we measured SCV fitness inside macrophages and measured their antibiotic resistance profiles. SCVs had a fitness advantage inside macrophages and showed increased aminoglycoside resistance in vitro, but had collateral sensitivity to other antibiotics (e.g. tetracycline). Importantly, we observed similar results in vivo. SCVs had a fitness advantage upon colonization of the mouse gut, which could be tuned by antibiotic treatment: kanamycin (aminoglycoside) increased SCV fitness, but tetracycline strongly reduced it. Our results highlight the power of using experimental evolution as the basis for identifying the causes and consequences of adaptation during host-microbe interactions.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Solving linear systems is an important problem for scientific computing. Exploiting parallelism is essential for solving complex systems, and this traditionally involves writing parallel algorithms on top of a library such as MPI. The SPIKE family of algorithms is one well-known example of a parallel solver for linear systems. The Hierarchically Tiled Array (HTA) data type extends traditional data-parallel array operations with explicit tiling and allows programmers to directly manipulate tiles. The tiles of the HTA data type map naturally to the block nature of many numeric computations, including the SPIKE family of algorithms. The higher level of abstraction of the HTA enables the same program to be portable across different platforms. Current implementations target both shared-memory and distributed-memory models. In this thesis we present a proof-of-concept for portable linear solvers. We implement two algorithms from the SPIKE family using the HTA library. We show that our implementations of SPIKE exploit the abstractions provided by the HTA to produce compact, clean code that can run on both shared-memory and distributed-memory models without modification. We discuss how we map the algorithms to HTA programs and examine their performance. We compare the performance of our HTA codes to comparable codes written in MPI as well as current state-of-the-art linear algebra routines.

Relevance:

20.00% 20.00%

Publisher:

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The evolution of wireless communication systems leads to Dynamic Spectrum Allocation for Cognitive Radio, which requires reliable spectrum sensing techniques. Among the spectrum sensing methods proposed in the literature, those that exploit cyclostationary characteristics of radio signals are particularly suitable for communication environments with low signal-to-noise ratios, or with non-stationary noise. However, such methods have high computational complexity that directly raises the power consumption of devices which often have very stringent low-power requirements. We propose a strategy for cyclostationary spectrum sensing with reduced energy consumption. This strategy is based on the principle that p processors working at slower frequencies consume less power than a single processor for the same execution time. We devise a strict relation between the energy savings and common parallel system metrics. The results of simulations show that our strategy promises very significant savings in actual devices.
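The principle stated in this abstract, that p slower processors can consume less power than one fast processor for the same execution time, can be made concrete with a small sketch. The model below is an assumption and not the relation derived by the authors: it takes dynamic power to scale as frequency cubed (voltage scaled along with frequency), and it takes speedup to equal p times the parallel efficiency. The function name and parameters are hypothetical.

```python
def relative_power(p, efficiency=1.0, exponent=3.0):
    """Power of p slowed-down processors relative to one full-speed processor.

    Assumed model (not taken from the paper):
      * dynamic power per processor scales as frequency ** exponent
      * parallel speedup = p * efficiency
      * each processor is slowed by the speedup factor, so the parallel
        run takes the same time as the single full-speed processor
    Because execution time is equal, this ratio is also the energy ratio.
    """
    if p < 1 or not (0.0 < efficiency <= 1.0):
        raise ValueError("need p >= 1 and 0 < efficiency <= 1")
    speedup = p * efficiency
    per_processor = (1.0 / speedup) ** exponent  # power of one slowed processor
    return p * per_processor
```

Under these assumptions `relative_power(4)` is 0.0625, that is, four ideally parallel processors at a quarter of the clock rate draw one sixteenth of the power. With poor efficiency the advantage disappears: `relative_power(2, efficiency=0.5)` is 2.0.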

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Vertebrate genomes are organised into a variety of nuclear environments and chromatin states that have profound effects on the regulation of gene transcription. This variation presents a major challenge to the expression of transgenes for experimental research, genetic therapies and the production of biopharmaceuticals. The majority of transgenes succumb to transcriptional silencing by their chromosomal environment when they are randomly integrated into the genome, a phenomenon known as chromosomal position effect (CPE). It is not always feasible to target transgene integration to transcriptionally permissive “safe harbour” loci that favour transgene expression, so there remains an unmet need to identify gene regulatory elements that can be added to transgenes to protect them against CPE. Dominant regulatory elements (DREs) with chromatin barrier (or boundary) activity have been shown to protect transgenes from CPE. The HS4 element from the chicken beta-globin locus and the A2UCOE element from a human housekeeping gene locus have been shown to function as DRE barriers in a wide variety of cell types and species. Despite rapid advances in the profiling of transcription factor binding, chromatin states and chromosomal looping interactions, progress towards functionally validating the many candidate barrier elements in vertebrates has been very slow. This is largely due to the lack of a tractable and efficient assay for chromatin barrier activity. In this study, I have developed the RGBarrier assay system to test the chromatin barrier activity of candidate DREs at pre-defined isogenic loci in human cells. The RGBarrier assay consists of an Flp-based RMCE reaction for the integration of an expression construct, carrying candidate DREs, in a pre-characterised chromosomal location. The RGBarrier system involves the tracking of red, green and blue fluorescent proteins by flow cytometry to monitor on-target versus off-target integration and transgene expression.
The analysis of the reporter (GFP) expression over several weeks gives a measure of the ability of each candidate element to protect against chromosomal silencing. This assay can be scaled up to test tens of new putative barrier elements in the same chromosomal context in parallel. The defined chromosomal contexts of the RGBarrier assays will allow for detailed mechanistic studies of chromosomal silencing and DRE barrier element action. Understanding these mechanisms will be of paramount importance for the design of specific solutions for overcoming chromosomal silencing in specific transgenic applications.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A poster of this paper will be presented at the 25th International Conference on Parallel Architecture and Compilation Technology (PACT ’16), September 11-15, 2016, Haifa, Israel.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with drug targets. Most VS methods assume a unique binding site for the target, but it has been demonstrated that diverse ligands interact with unrelated parts of the target, and many VS methods do not take this relevant fact into account. This problem is circumvented by a novel VS methodology named BINDSURF that scans the whole protein surface to find new hotspots where ligands might potentially interact, and which is implemented on massively parallel Graphics Processing Units, allowing fast processing of large ligand databases. BINDSURF can thus be used in drug discovery, drug design and drug repurposing, and therefore helps considerably in clinical research. However, the accuracy of most VS methods is constrained by limitations in the scoring function that describes biomolecular interactions, and even nowadays these uncertainties are not completely understood. In order to solve this problem, we propose a novel approach where neural networks are trained with databases of known active (drugs) and inactive compounds, and later used to improve VS predictions.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In Brazil, human and canine visceral leishmaniasis (CVL) caused by Leishmania infantum has undergone urbanisation since 1980, constituting a public health problem, and serological tests are tools of choice for identifying infected dogs. Until recently, the Brazilian zoonoses control program recommended enzyme-linked immunosorbent assays (ELISA) and indirect immunofluorescence assays (IFA) as the screening and confirmatory methods, respectively, for the detection of canine infection. The purpose of this study was to estimate the accuracy of ELISA and IFA in parallel or serial combinations. The reference standard comprised the results of direct visualisation of parasites in histological sections, immunohistochemical test, or isolation of the parasite in culture. Samples from 98 cases and 1,327 noncases were included. Individually, the ELISA and IFA presented sensitivities of 91.8% and 90.8% and specificities of 83.4% and 53.4%, respectively. When tests were used in parallel combination, sensitivity attained 99.2%, while specificity dropped to 44.8%. When used in serial combination (ELISA followed by IFA), decreased sensitivity (83.3%) and increased specificity (92.5%) were observed. The serial testing approach improved specificity with a moderate loss in sensitivity. This strategy could partially fulfill the needs of public health and dog owners for a more accurate diagnosis of CVL.
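The direction of the parallel and serial results reported in this abstract follows from the standard textbook formulas for combining two tests. The sketch below assumes the two tests err independently given the true infection status, an assumption the abstract does not state; the study's figures were measured on real samples, so they need not match the formulas exactly. The function name `combine_tests` is hypothetical.

```python
def combine_tests(se1, sp1, se2, sp2):
    """Combined (sensitivity, specificity) of two diagnostic tests.

    Assumes the tests are conditionally independent given true status.
    parallel: positive if EITHER test is positive
    serial:   positive only if BOTH tests are positive
    """
    parallel = (
        1.0 - (1.0 - se1) * (1.0 - se2),  # missed only if both tests miss
        sp1 * sp2,                        # negative only if both are negative
    )
    serial = (
        se1 * se2,                        # detected only if both detect
        1.0 - (1.0 - sp1) * (1.0 - sp2),  # false positive only if both err
    )
    return {"parallel": parallel, "serial": serial}
```

With the individual values from the abstract, `combine_tests(0.918, 0.834, 0.908, 0.534)` gives roughly 99.2% sensitivity and 44.5% specificity in parallel, and 83.4% sensitivity and 92.3% specificity in series. These are close to the reported 99.2%/44.8% and 83.3%/92.5%, with the small gaps consistent with the tests not being perfectly independent.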