95 resultados para Massively parallel sequencing
em CentAUR: Central Archive University of Reading - UK
Resumo:
In the 1990s the Message Passing Interface Forum defined MPI bindings for Fortran, C, and C++. With the success of MPI these relatively conservative languages have continued to dominate in the parallel computing community. There are compelling arguments in favour of more modern languages like Java. These include portability, better runtime error checking, modularity, and multi-threading. But these arguments have not converted many HPC programmers, perhaps due to the scarcity of full-scale scientific Java codes, and the lack of evidence for performance competitive with C or Fortran. This paper tries to redress this situation by porting two scientific applications to Java. Both of these applications are parallelized using our thread-safe Java messaging system—MPJ Express. The first application is the Gadget-2 code, which is a massively parallel structure formation code for cosmological simulations. The second application uses the finite-domain time-difference method for simulations in the area of computational electromagnetics. We evaluate and compare the performance of the Java and C versions of these two scientific applications, and demonstrate that the Java codes can achieve performance comparable with legacy applications written in conventional HPC languages. Copyright © 2009 John Wiley & Sons, Ltd.
Resumo:
Although climate models have been improving in accuracy and efficiency over the past few decades, it now seems that these incremental improvements may be slowing. As tera/petascale computing becomes massively parallel, our legacy codes are less suitable, and even with the increased resolution that we are now beginning to use, these models cannot represent the multiscale nature of the climate system. This paper argues that it may be time to reconsider the use of adaptive mesh refinement for weather and climate forecasting in order to achieve good scaling and representation of the wide range of spatial scales in the atmosphere and ocean. Furthermore, the challenge of introducing living organisms and human responses into climate system models is only just beginning to be tackled. We do not yet have a clear framework in which to approach the problem, but it is likely to cover such a huge number of different scales and processes that radically different methods may have to be considered. The challenges of multiscale modelling and petascale computing provide an opportunity to consider a fresh approach to numerical modelling of the climate (or Earth) system, which takes advantage of the computational fluid dynamics developments in other fields and brings new perspectives on how to incorporate Earth system processes. This paper reviews some of the current issues in climate (and, by implication, Earth) system modelling, and asks the question whether a new generation of models is needed to tackle these problems.
Resumo:
Gadget-2 is a massively parallel structure formation code for cosmological simulations. In this paper, we present a Java version of Gadget-2. We evaluated the performance of the Java version by running colliding galaxies simulation and found that it can achieve around 70% of C Gadget-2's performance.
Resumo:
This paper makes a contribution in bridging the theory and practice of the polyhedral model for designing parallel algorithms. Although the theory of polyhedral model is well developed, designers of massively parallel algorithms are unable to benefit from the theory due to the lack of software tools that incorporate the wide range of transformations that are possible in the model. The Uniformization tool that we developed was the first to integrate a number of techniques and to completely automate the transformation step allowing designers to explore a wide range of feasible designs from high-level specifications.
Resumo:
Since its introduction in 1993, the Message Passing Interface (MPI) has become a de facto standard for writing High Performance Computing (HPC) applications on clusters and Massively Parallel Processors (MPPs). The recent emergence of multi-core processor systems presents a new challenge for established parallel programming paradigms, including those based on MPI. This paper presents a new Java messaging system called MPJ Express. Using this system, we exploit multiple levels of parallelism - messaging and threading - to improve application performance on multi-core processors. We refer to our approach as nested parallelism. This MPI-like Java library can support nested parallelism by using Java or Java OpenMP (JOMP) threads within an MPJ Express process. Practicality of this approach is assessed by porting to Java a massively parallel structure formation code from Cosmology called Gadget-2. We introduce nested parallelism in the Java version of the simulation code and report good speed-ups. To the best of our knowledge it is the first time this kind of hybrid parallelism is demonstrated in a high performance Java application. (C) 2009 Elsevier Inc. All rights reserved.
Resumo:
With the transition to multicore processors almost complete, the parallel processing community is seeking efficient ways to port legacy message passing applications on shared memory and multicore processors. MPJ Express is our reference implementation of Message Passing Interface (MPI)-like bindings for the Java language. Starting with the current release, the MPJ Express software can be configured in two modes: the multicore and the cluster mode. In the multicore mode, parallel Java applications execute on shared memory or multicore processors. In the cluster mode, Java applications parallelized using MPJ Express can be executed on distributed memory platforms like compute clusters and clouds. The multicore device has been implemented using Java threads in order to satisfy two main design goals of portability and performance. We also discuss the challenges of integrating the multicore device in the MPJ Express software. This turned out to be a challenging task because the parallel application executes in a single JVM in the multicore mode. On the contrary in the cluster mode, the parallel user application executes in multiple JVMs. Due to these inherent architectural differences between the two modes, the MPJ Express runtime is modified to ensure correct semantics of the parallel program. Towards the end, we compare performance of MPJ Express (multicore mode) with other C and Java message passing libraries---including mpiJava, MPJ/Ibis, MPICH2, MPJ Express (cluster mode)---on shared memory and multicore processors. We found out that MPJ Express performs signicantly better in the multicore mode than in the cluster mode. Not only this but the MPJ Express software also performs better in comparison to other Java messaging libraries including mpiJava and MPJ/Ibis when used in the multicore mode on shared memory or multicore processors. We also demonstrate effectiveness of the MPJ Express multicore device in Gadget-2, which is a massively parallel astrophysics N-body siimulation code.
Resumo:
Monitoring Earth's terrestrial water conditions is critically important to many hydrological applications such as global food production; assessing water resources sustainability; and flood, drought, and climate change prediction. These needs have motivated the development of pilot monitoring and prediction systems for terrestrial hydrologic and vegetative states, but to date only at the rather coarse spatial resolutions (∼10–100 km) over continental to global domains. Adequately addressing critical water cycle science questions and applications requires systems that are implemented globally at much higher resolutions, on the order of 1 km, resolutions referred to as hyperresolution in the context of global land surface models. This opinion paper sets forth the needs and benefits for a system that would monitor and predict the Earth's terrestrial water, energy, and biogeochemical cycles. We discuss six major challenges in developing a system: improved representation of surface‐subsurface interactions due to fine‐scale topography and vegetation; improved representation of land‐atmospheric interactions and resulting spatial information on soil moisture and evapotranspiration; inclusion of water quality as part of the biogeochemical cycle; representation of human impacts from water management; utilizing massively parallel computer systems and recent computational advances in solving hyperresolution models that will have up to 109 unknowns; and developing the required in situ and remote sensing global data sets. We deem the development of a global hyperresolution model for monitoring the terrestrial water, energy, and biogeochemical cycles a “grand challenge” to the community, and we call upon the international hydrologic community and the hydrological science support infrastructure to endorse the effort.
Resumo:
The DNA G-qadruplexes are one of the targets being actively explored for anti-cancer therapy by inhibiting them through small molecules. This computational study was conducted to predict the binding strengths and orientations of a set of novel dimethyl-amino-ethyl-acridine (DACA) analogues that are designed and synthesized in our laboratory, but did not diffract in Synchrotron light.Thecrystal structure of DNA G-Quadruplex(TGGGGT)4(PDB: 1O0K) was used as target for their binding properties in our studies.We used both the force field (FF) and QM/MM derived atomic charge schemes simultaneously for comparing the predictions of drug binding modes and their energetics. This study evaluates the comparative performance of fixed point charge based Glide XP docking and the quantum polarized ligand docking schemes. These results will provide insights on the effects of including or ignoring the drug-receptor interfacial polarization events in molecular docking simulations, which in turn, will aid the rational selection of computational methods at different levels of theory in future drug design programs. Plenty of molecular modelling tools and methods currently exist for modelling drug-receptor or protein-protein, or DNA-protein interactionssat different levels of complexities.Yet, the capasity of such tools to describevarious physico-chemical propertiesmore accuratelyis the next step ahead in currentresearch.Especially, the usage of most accurate methods in quantum mechanics(QM) is severely restricted by theirtedious nature. Though the usage of massively parallel super computing environments resulted in a tremendous improvement in molecular mechanics (MM) calculations like molecular dynamics,they are still capable of dealing with only a couple of tens to hundreds of atoms for QM methods. One such efficient strategy that utilizes thepowers of both MM and QM are the QM/MM hybrid methods. Lately, attempts have been directed towards the goal of deploying several different QM methods for betterment of force field based simulations, but with practical restrictions in place. One of such methods utilizes the inclusion of charge polarization events at the drug-receptor interface, that is not explicitly present in the MM FF.
Resumo:
The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (EpidemicK-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state of the art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed clusters distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
Resumo:
Clustering is defined as the grouping of similar items in a set, and is an important process within the field of data mining. As the amount of data for various applications continues to increase, in terms of its size and dimensionality, it is necessary to have efficient clustering methods. A popular clustering algorithm is K-Means, which adopts a greedy approach to produce a set of K-clusters with associated centres of mass, and uses a squared error distortion measure to determine convergence. Methods for improving the efficiency of K-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting a more efficient data structure, notably a multi-dimensional binary search tree (KD-Tree) to store either centroids or data points. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient K-Means techniques in parallel computational environments. In this work, we provide a parallel formulation for the KD-Tree based K-Means algorithm and address its load balancing issues.
Resumo:
An eddy current testing system consists of a multi-sensor probe, a computer and a special expansion card and software for data-collection and analysis. The probe incorporates an excitation coil, and sensor coils; at least one sensor coil is a lateral current-normal coil and at least one is a current perturbation coil.
Resumo:
An eddy current testing system consists of a multi-sensor probe, computer and a special expansion card and software for data collection and analysis. The probe incorporates an excitation coil, and sensor coils; at least one sensor coil is a lateral current-normal coil and at least one is a current perturbation coil.
Resumo:
One among the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques allow to reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy.
Resumo:
This paper presents the results of the application of a parallel Genetic Algorithm (GA) in order to design a Fuzzy Proportional Integral (FPI) controller for active queue management on Internet routers. The Active Queue Management (AQM) policies are those policies of router queue management that allow the detection of network congestion, the notification of such occurrences to the hosts on the network borders, and the adoption of a suitable control policy. Two different parallel implementations of the genetic algorithm are adopted to determine an optimal configuration of the FPI controller parameters. Finally, the results of several experiments carried out on a forty nodes cluster of workstations are presented.
Resumo:
This paper presents a parallel genetic algorithm to the Steiner Problem in Networks. Several previous papers have proposed the adoption of GAs and others metaheuristics to solve the SPN demonstrating the validity of their approaches. This work differs from them for two main reasons: the dimension and the characteristics of the networks adopted in the experiments and the aim from which it has been originated. The reason that aimed this work was namely to build a comparison term for validating deterministic and computationally inexpensive algorithms which can be used in practical engineering applications, such as the multicast transmission in the Internet. On the other hand, the large dimensions of our sample networks require the adoption of a parallel implementation of the Steiner GA, which is able to deal with such large problem instances.