3 results for dynamic binary instrumentation

in CentAUR: Central Archive University of Reading - UK


Relevance:

30.00%

Abstract:

The k-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Techniques for improving the efficiency of k-Means have been explored mainly in two directions. In the first, the amount of computation is significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques reduce the number of distance computations the algorithm performs at each iteration. The second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set, and a third solution incorporates a dynamic load balancing policy.
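The abstract does not say which KD-Tree pruning rule the authors use. As a minimal sketch, assuming the well-known "filtering" idea (a centroid is dropped for a tree cell when no point of the cell's bounding box can be closer to it than to the current best candidate), one k-Means iteration over a KD-Tree could look like the following. All names here (`build`, `is_farther`, `filter_node`, `kmeans_step`) are hypothetical and chosen for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build(points):
    """Build a KD-tree node holding a bounding box, point count and vector sum."""
    dim = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dim)]
    hi = [max(p[d] for p in points) for d in range(dim)]
    node = {"lo": lo, "hi": hi, "n": len(points),
            "sum": [sum(p[d] for p in points) for d in range(dim)],
            "left": None, "right": None}
    axis = max(range(dim), key=lambda d: hi[d] - lo[d])  # widest dimension
    if len(points) > 1 and hi[axis] > lo[axis]:
        pts = sorted(points, key=lambda p: p[axis])
        mid = len(pts) // 2
        node["left"], node["right"] = build(pts[:mid]), build(pts[mid:])
    return node

def is_farther(z, z_star, lo, hi):
    """True if no point of the box [lo, hi] is closer to z than to z_star."""
    # v is the box corner that lies farthest in the direction z - z_star.
    v = [hi[d] if z[d] > z_star[d] else lo[d] for d in range(len(z))]
    return sq_dist(z, v) >= sq_dist(z_star, v)

def filter_node(node, cand, centroids, sums, counts):
    """Assign the points under node to centroids, pruning candidates per cell."""
    mid = [(l + h) / 2 for l, h in zip(node["lo"], node["hi"])]
    best = min(cand, key=lambda j: sq_dist(centroids[j], mid))
    keep = [j for j in cand
            if j == best or not is_farther(centroids[j], centroids[best],
                                           node["lo"], node["hi"])]
    if len(keep) == 1 or node["left"] is None:
        # Whole cell belongs to one centroid: no per-point distance computations.
        counts[best] += node["n"]
        for d, s in enumerate(node["sum"]):
            sums[best][d] += s
        return
    filter_node(node["left"], keep, centroids, sums, counts)
    filter_node(node["right"], keep, centroids, sums, counts)

def kmeans_step(root, centroids):
    """One k-Means iteration; returns (new_centroids, cluster_sizes)."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    filter_node(root, list(range(k)), centroids, sums, counts)
    new = [[s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
           for j in range(k)]
    return new, counts
```

The depth at which the recursion stops depends on where the centroids lie relative to the data, which is one way to see why the work per subtree is irregular and why a static split of the tree across processes may become unbalanced.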

Relevance:

30.00%

Abstract:

We present extensive molecular dynamics simulations of the dynamics of diluted long probe chains entangled with a matrix of shorter chains. The chain lengths of both components are above the entanglement strand length, and the ratio of their lengths is varied over a wide range to cover the crossover from the chain reptation regime to the tube Rouse motion regime of the long probe chains. Reducing the matrix chain length results in a faster decay of the dynamic structure factor of the probe chains, in good agreement with recent neutron spin echo experiments. The diffusion of the long chains, measured by the mean square displacements of the monomers and the centers of mass of the chains, demonstrates a systematic speed-up relative to the pure reptation behavior expected for monodisperse melts of sufficiently long polymers. On the other hand, the diffusion of the matrix chains is only weakly perturbed by the diluted long probe chains. The simulation results are qualitatively consistent with the theoretical predictions based on the constraint release Rouse model, but a detailed comparison reveals the existence of a broad distribution of the disentanglement rates, which is partly confirmed by an analysis of the packing and diffusion of the matrix chains in the tube region of the probe chains. A coarse-grained simulation model based on the tube Rouse motion model, incorporating the probability distribution of the tube segment jump rates, is developed and shows results qualitatively consistent with the fine-scale molecular dynamics simulations. However, we observe a breakdown in the tube Rouse model when the short chain length is decreased to around N_S = 80, which is roughly 3.5 times the entanglement spacing N_e^P = 23. The location of this transition may be sensitive to the chain bending potential used in our simulations.

Relevance:

30.00%

Abstract:

Exascale systems are the next frontier in high-performance computing and are expected to deliver performance on the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can either be found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
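The abstract gives no details of the protocol itself, so the following is not the authors' method. It is only a single-process sketch, under stated assumptions, of why non-uniform data distributions might allow smaller communication groups: each simulated worker computes partial sums and counts for its local points, and the update for a given centroid then needs to combine results only from the workers that actually contributed points to it, rather than from all workers. The function names (`local_stats`, `reduce_by_group`) and the rule "group = workers with a non-zero count" are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def local_stats(points, centroids):
    """Per-worker step: partial coordinate sums and counts for each centroid."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts

def reduce_by_group(stats, centroids):
    """Combine partial results per centroid using only contributing workers.

    stats[w] is the (sums, counts) pair produced by worker w.
    Returns (new_centroids, groups), where groups[j] lists the workers
    that took part in the reduction for centroid j.
    """
    k, dim = len(centroids), len(centroids[0])
    new_centroids, groups = [], []
    for j in range(k):
        # Assumed grouping rule: only workers holding points of cluster j.
        group = [w for w, (_, counts) in enumerate(stats) if counts[j] > 0]
        total = sum(stats[w][1][j] for w in group)
        if total == 0:
            new_centroids.append(list(centroids[j]))  # keep an empty cluster's centroid
        else:
            new_centroids.append(
                [sum(stats[w][0][j][d] for w in group) / total for d in range(dim)])
        groups.append(group)
    return new_centroids, groups
```

In a baseline data-parallel k-means, every worker would take part in a global reduction for every centroid at each iteration. In this sketch, when each worker's data is spatially localized, most groups contain far fewer workers than the whole machine.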