8 results for parallelization
in CentAUR: Central Archive University of Reading - UK
Abstract:
Mobile Network Optimization (MNO) technologies have advanced at a tremendous pace in recent years. The Dynamic Network Optimization (DNO) concept, which emerged years ago, aims to continuously optimize the network in response to variations in network traffic and conditions. Yet DNO development is still in its infancy, hindered mainly by the bottleneck of lengthy optimization runtimes. This paper identifies parallelism in greedy MNO algorithms and presents an advanced distributed parallel solution. The solution is designed, implemented and applied to real-life projects, where it yields a significant, highly scalable and nearly linear speedup of up to 6.9 and 14.5 on distributed 8-core and 16-core systems respectively. Meanwhile, the optimization outputs exhibit self-consistency and high precision compared to their sequential counterparts. This is a milestone in realizing DNO. Further, the techniques may be applied to similar applications based on greedy optimization algorithms.
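To put the "nearly linear" claim in numbers, the quoted speedups can be converted to parallel efficiency (speedup divided by core count, where 1.0 would be perfectly linear scaling). The short sketch below uses only the two figures stated in the abstract; the function name is illustrative and not taken from the paper.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Efficiency = speedup / cores; 1.0 corresponds to ideal linear scaling."""
    return speedup / cores

# Speedups quoted in the abstract, keyed by core count.
reported = {8: 6.9, 16: 14.5}
efficiencies = {cores: parallel_efficiency(s, cores) for cores, s in reported.items()}

for cores, eff in efficiencies.items():
    print(f"{cores} cores: efficiency {eff:.2f}")
```

On these figures the efficiency is about 0.86 on 8 cores and about 0.91 on 16 cores.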
Abstract:
It has been years since the introduction of the Dynamic Network Optimization (DNO) concept, yet DNO development is still in its infancy, largely due to the lack of a breakthrough in reducing the lengthy optimization runtime. Our previous work, a distributed parallel solution, achieved a significant speed gain. To cater for the increased optimization complexity driven by the uptake of smartphones and tablets, however, this paper examines potential areas for further improvement and presents a novel asynchronous distributed parallel design that minimizes inter-process communication. The new approach is implemented and applied to real-life projects, whose results demonstrate an improved speedup of 7.5 times on a 16-core distributed system, compared to 6.1 for our previous solution. Moreover, there is no degradation in the optimization outcome. This is a solid step towards the realization of DNO.
Abstract:
The simulated annealing approach to structure solution from powder diffraction data, as implemented in the DASH program, is easily amenable to parallelization at the individual run level. Very large scale increases in speed of execution can therefore be achieved by distributing individual DASH runs over a network of computers. The GDASH program achieves this by packaging DASH in a form that enables it to run under the Univa UD Grid MP system, which harnesses networks of existing computing resources to perform calculations.
Abstract:
The simulated annealing approach to structure solution from powder diffraction data, as implemented in the DASH program, is easily amenable to parallelization at the individual run level. Modest increases in speed of execution can therefore be achieved by executing individual DASH runs on the individual cores of CPUs.
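Run-level parallelism of the kind described above can be sketched as follows: each simulated annealing run is fully independent, so whole runs are handed to separate worker processes and only the best result is kept. This is a toy illustration and not DASH code. The objective `cost`, the cooling schedule and the helpers `anneal` and `run_all` are all assumptions made for the example.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor

def cost(x: float) -> float:
    """Toy objective standing in for a real figure of merit (hypothetical)."""
    return (x - 3.0) ** 2

def anneal(seed: int, steps: int = 2000) -> tuple[float, float]:
    """One independent simulated annealing run; returns (best_cost, best_x)."""
    rng = random.Random(seed)  # per-run RNG, so runs share no state
    x = rng.uniform(-10.0, 10.0)
    current = cost(x)
    best_cost, best_x = current, x
    temperature = 10.0
    for _ in range(steps):
        candidate = x + rng.gauss(0.0, 1.0)
        c = cost(candidate)
        # Metropolis criterion: accept improvements, sometimes accept worse moves.
        if c < current or rng.random() < math.exp((current - c) / temperature):
            x, current = candidate, c
            if current < best_cost:
                best_cost, best_x = current, x
        temperature = max(temperature * 0.995, 1e-6)  # geometric cooling
    return best_cost, best_x

def run_all(seeds, workers=None):
    """Farm out whole runs, one task per seed, and keep the best result."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return min(pool.map(anneal, seeds))

if __name__ == "__main__":
    best_cost, best_x = run_all(range(8))
    print(f"best cost {best_cost:.4f} at x = {best_x:.3f}")
```

Because the runs never communicate, the only coordination cost is starting the workers and collecting one small result per run, which is why this coarse-grained strategy maps naturally onto the cores of a single CPU.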
Abstract:
Large scale air pollution models are powerful tools, designed to meet the increasing demand in different environmental studies. The atmosphere is the most dynamic component of the environment, where pollutants can be transported quickly over long distances. Therefore air pollution modeling must be done in a large computational domain. Moreover, all relevant physical, chemical and photochemical processes must be taken into account. In such complex models operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The Danish Eulerian Model (DEM) is one of the most advanced such models. Its space domain (4800 × 4800 km) covers Europe, most of the Mediterranean and neighboring parts of Asia and the Atlantic Ocean. Efficient parallelization is crucial for the performance and practical capabilities of this huge computational model. Different splitting schemes, based on the main processes mentioned above, have been implemented and tested with respect to accuracy and performance in the new version of DEM. Some numerical results of these experiments are presented in this paper.
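The idea behind operator splitting can be illustrated with a minimal sketch: instead of solving transport and chemistry together, each time step applies one sub-solver per process in sequence. The example below is a toy on a periodic one-dimensional grid, not one of the DEM schemes. The functions `advect`, `react` and `split_step`, the grid size and the rate constant are all assumptions chosen for illustration.

```python
import math

def advect(c, shift=1):
    """Transport sub-step: move the field `shift` cells along a periodic 1-D grid."""
    shift %= len(c)
    return c[-shift:] + c[:-shift] if shift else list(c)

def react(c, rate, dt):
    """Chemistry sub-step: first-order decay, integrated exactly over dt."""
    decay = math.exp(-rate * dt)
    return [value * decay for value in c]

def split_step(c, rate, dt):
    """One sequential splitting step: transport first, then chemistry."""
    return react(advect(c), rate, dt)

# A single puff of pollutant in cell 2 of a 10-cell periodic grid.
field = [0.0] * 10
field[2] = 1.0
rate, dt, steps = 0.1, 0.5, 4
for _ in range(steps):
    field = split_step(field, rate, dt)

peak = max(range(len(field)), key=field.__getitem__)
print(peak, round(field[peak], 4))
```

After four steps the puff has moved from cell 2 to cell 6 and decayed by a factor of exp(-0.2). In this toy the two sub-steps commute, so the splitting introduces no error; in a realistic model they generally do not, which is why different splitting schemes have to be compared for accuracy as well as performance.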
Abstract:
Recently major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism. The first contribution addresses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach for distributed memory systems to the frequent subgraph mining problem. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and the customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVM), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.
Abstract:
The fast increase in the size and number of databases demands data mining approaches that are scalable to large amounts of data. This has led to the exploration of parallel computing technologies in order to perform data mining tasks concurrently using several processors. Parallelization seems to be a natural and cost-effective way to scale up data mining technologies. One of the most important of these data mining technologies is the classification of newly recorded data. This paper surveys advances in parallelization in the field of classification rule induction.
Abstract:
The simulated annealing approach to crystal structure determination from powder diffraction data, as implemented in the DASH program, is readily amenable to parallelization at the individual run level. Very large scale increases in speed of execution can be achieved by distributing individual DASH runs over a network of computers. The CDASH program delivers this by using scalable on-demand computing clusters built on the Amazon Elastic Compute Cloud service. By way of example, a 360 vCPU cluster returned the crystal structure of racemic ornidazole (Z′ = 3, 30 degrees of freedom) ca. 40 times faster than a typical modern quad-core desktop CPU. Whilst used here specifically for DASH, this approach is of general applicability to other packages that are amenable to coarse-grained parallelism strategies.