922 results for Parallel numerical algorithms


Relevance:

30.00%

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs.
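
To make the "global reduction at each iteration step" concrete, here is a minimal sketch of the straightforward parallel formulation that the abstract takes as its starting point, not the relaxed formulation the work proposes. The function names (`local_stats`, `global_reduce`, `kmeans_step`) are illustrative assumptions, and the reduction is simulated in a single process rather than performed with a real message-passing library.

```python
def local_stats(points, centroids):
    """Per-partition work: assign each point to its nearest centroid and
    accumulate per-cluster coordinate sums and point counts."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        # Index of the nearest centroid by squared Euclidean distance.
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def global_reduce(partials):
    """Stand-in for the global reduction: element-wise sum of the partial
    sums and counts produced by every partition (hypothetical helper)."""
    k, dim = len(partials[0][0]), len(partials[0][0][0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for part_sums, part_counts in partials:
        for j in range(k):
            counts[j] += part_counts[j]
            for d in range(dim):
                sums[j][d] += part_sums[j][d]
    return sums, counts


def kmeans_step(partitions, centroids):
    """One k-means iteration: local statistics, one global reduction,
    then the centroid update. Empty clusters keep their old centroid."""
    sums, counts = global_reduce([local_stats(p, centroids) for p in partitions])
    return [
        [x / counts[j] for x in sums[j]] if counts[j] else list(centroids[j])
        for j in range(len(centroids))
    ]
```

Because every partition contributes to every centroid, `global_reduce` is the step that must touch all processes in each iteration; this is the communication the abstract aims to relax.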

Relevance:

30.00%

Abstract:

A class of identification algorithms is introduced for Gaussian process (GP) models. The fundamental approach is to propose a new kernel function which leads to a covariance matrix with low rank, a property that is consequently exploited for computational efficiency in both model parameter estimation and model predictions. The objective of either maximizing the marginal likelihood or the Kullback–Leibler (K–L) divergence between the estimated output probability density function (pdf) and the true pdf has been used as the respective cost function. For each cost function, an efficient coordinate descent algorithm is proposed to estimate the kernel parameters using a one-dimensional derivative-free search, and the noise variance using a fast gradient descent algorithm. Numerical examples are included to demonstrate the effectiveness of the new identification approaches.

Relevance:

30.00%

Abstract:

With the prospect of exascale computing, computational methods requiring only local data become especially attractive. Consequently, the typical domain decomposition of atmospheric models means horizontally-explicit vertically-implicit (HEVI) time-stepping schemes warrant further attention. In this analysis, Runge-Kutta implicit-explicit schemes from the literature are analysed for their stability and accuracy using a von Neumann stability analysis of two linear systems. Attention is paid to the numerical phase to indicate the behaviour of phase and group velocities. Where the analysis is tractable, analytically derived expressions are considered. For more complicated cases, amplification factors have been numerically generated and the associated amplitudes and phase diagnosed. Analysis of a system describing acoustic waves has necessitated attributing the three resultant eigenvalues to the three physical modes of the system. To do so, a series of algorithms has been devised to track the eigenvalues across the frequency space. The result enables analysis of whether the schemes exactly preserve the non-divergent mode; and whether there is evidence of spurious reversal in the direction of group velocities or asymmetry in the damping for the pair of acoustic modes. Frequency ranges that span next-generation high-resolution weather models to coarse-resolution climate models are considered; and a comparison is made of errors accumulated from multiple stability-constrained shorter time-steps from the HEVI scheme with a single integration from a fully implicit scheme over the same time interval. Two schemes, “Trap2(2,3,2)” and “UJ3(1,3,2)”, both already used in atmospheric models, are identified as offering consistently good stability and representation of phase across all the analyses. Furthermore, according to a simple measure of computational cost, “Trap2(2,3,2)” is the least expensive.
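
As a rough illustration of what a von Neumann analysis diagnoses, the sketch below computes the amplification factor of two textbook one-step schemes applied to the oscillation equation dy/dt = iωy. These are forward Euler and the implicit trapezoidal rule, chosen here as assumptions for illustration; they are not the Runge-Kutta implicit-explicit schemes analysed in the paper, and the function names are hypothetical.

```python
import cmath


def amp_forward_euler(z):
    """Amplification factor A(z) of forward Euler, with z = i * omega * dt."""
    return 1 + z


def amp_trapezoidal(z):
    """Amplification factor A(z) of the implicit trapezoidal rule."""
    return (1 + z / 2) / (1 - z / 2)


def diagnose(amp, omega_dt):
    """Return (amplitude, relative phase) of one step for dy/dt = i*omega*y.

    amplitude > 1 means the scheme amplifies the wave (unstable),
    amplitude < 1 means it damps it; relative phase < 1 means the
    numerical wave travels slower than the exact one.
    """
    a = amp(1j * omega_dt)
    return abs(a), cmath.phase(a) / omega_dt
```

For example, `diagnose(amp_trapezoidal, 0.5)` gives an amplitude of 1 to rounding error with a relative phase slightly below 1, whereas forward Euler gives an amplitude above 1 for any non-zero `omega_dt`.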

Relevance:

30.00%

Abstract:

A great number of studies on wind conditions in passages between slab-type buildings have been conducted in the past. However, wind conditions under different building structures and configurations remain unclear, and existing studies still cannot provide guidance for urban planning and design, owing to the complexity of buildings and aerodynamics. The aim of this paper is to provide more insight into the mechanism of wind conditions in passages. In this paper, a simplified passage model with non-parallel buildings is developed on the basis of the wind tunnel experiments conducted by Blocken et al. (2008). Numerical simulation based on CFD is employed for a detailed investigation of the wind environment in passages between two long narrow buildings with different directions, and model validation is performed by comparing numerical results with corresponding wind tunnel measurements.

Relevance:

30.00%

Abstract:

We study a brightening of the Lyman-alpha emission in the cusp which occurred in response to a short-lived southward turning of the interplanetary magnetic field (IMF) during a period of strongly enhanced solar wind plasma concentration. The cusp proton emission is detected using the SI-12 channel of the FUV imager on the IMAGE spacecraft. Analysis of the IMF observations recorded by the ACE and Wind spacecraft reveals that the assumption of a constant propagation lag from the upstream spacecraft to the Earth is not adequate for these high time-resolution studies. The variations of the southward IMF component observed by ACE and Wind allow for the calculation of the ACE-to-Earth lag as a function of time. Application of the derived propagation delays reveals that the intensity of the cusp emission varied systematically with the IMF clock angle, the relationship being particularly striking when the intensity is normalised to allow for the variation in the upstream solar wind proton concentration. The latitude of the cusp migrated equatorward while the lagged IMF pointed southward, confirming the lag calculation and indicating ongoing magnetopause reconnection. Dayside convection, as monitored by the SuperDARN network of radars, responded rapidly to the IMF changes but lagged behind the cusp proton emission response: this is shown to be as predicted by the model of flow excitation by Cowley and Lockwood (1992). We use the numerical cusp ion precipitation model of Lockwood and Davis (1996), along with modelled Lyman-alpha emission efficiency and the SI-12 instrument response, to investigate the effect of the sheath field clock angle on the acceleration of ions on crossing the dayside magnetopause. This modelling reveals that the emission commences on each reconnected field line 2–2.5 min after it is opened and peaks 3–5 min after it is opened.
We discuss how comparison of the Lyman-alpha intensities with oxygen emissions observed simultaneously by the SI-13 channel of the FUV instrument offers an opportunity to test whether or not the clock angle dependence is consistent with the “component” or the “anti-parallel” reconnection hypothesis.

Relevance:

30.00%

Abstract:

Advances in hardware technologies allow data to be captured and processed in real time, and the resulting high-throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real time. The creation and real-time adaptation of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even though these algorithms are fast, they are challenged by high-velocity data streams, where data instances arrive at a fast rate. This is problematic if the application requires little or no delay between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non-streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.
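
One simple way to make KNN adaptive on a stream is to keep only the most recent labelled instances in a fixed-size window, so that old patterns are forgotten as new ones arrive. The sketch below shows that idea sequentially; the class name, the window policy, and the parameters are assumptions for illustration and are not taken from the paper, which also addresses parallelisation.

```python
from collections import Counter, deque


class SlidingWindowKNN:
    """KNN classifier over the most recent `window` labelled instances."""

    def __init__(self, k=3, window=100):
        self.k = k
        # A bounded deque drops the oldest instance automatically.
        self.window = deque(maxlen=window)

    def learn(self, x, label):
        """Absorb one labelled instance from the stream."""
        self.window.append((tuple(x), label))

    def predict(self, x):
        """Majority label among the k nearest stored instances
        (squared Euclidean distance); None if nothing has been seen."""
        if not self.window:
            return None
        nearest = sorted(
            self.window,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
        )[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
```

The distance computations over the window are independent of one another, which is what makes this kind of classifier a natural candidate for parallel execution.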

Relevance:

30.00%

Abstract:

Mobile Network Optimization (MNO) technologies have advanced at a tremendous pace in recent years. The Dynamic Network Optimization (DNO) concept emerged years ago, aiming to continuously optimize the network in response to variations in network traffic and conditions. Yet DNO development is still in its infancy, mainly hindered by a significant bottleneck: the lengthy optimization runtime. This paper identifies parallelism in greedy MNO algorithms and presents an advanced distributed parallel solution. The solution is designed, implemented and applied to real-life projects, whose results yield a significant, highly scalable and nearly linear speedup of up to 6.9 and 14.5 on distributed 8-core and 16-core systems respectively. Meanwhile, optimization outputs exhibit self-consistency and high precision compared to their sequential counterpart. This is a milestone in realizing DNO. Further, the techniques may be applied to similar applications based on greedy optimization algorithms.
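
The "nearly linear" claim can be checked with the usual definition of parallel efficiency, speedup divided by the number of cores. The helper below is a hypothetical one-liner written for this illustration; the speedup figures are those quoted in the abstract.

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency: achieved speedup as a fraction of the ideal
    (linear) speedup, which equals the number of cores."""
    return speedup / cores


# Figures quoted in the abstract: 6.9x on 8 cores, 14.5x on 16 cores.
reported = {8: 6.9, 16: 14.5}
efficiencies = {cores: parallel_efficiency(s, cores) for cores, s in reported.items()}
```

This gives efficiencies of roughly 0.86 on 8 cores and 0.91 on 16 cores, so efficiency does not fall as the core count doubles.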

Relevance:

30.00%

Abstract:

It has been years since the introduction of the Dynamic Network Optimization (DNO) concept, yet DNO development is still in its infancy, largely due to a lack of breakthroughs in minimizing the lengthy optimization runtime. Our previous work, a distributed parallel solution, has achieved a significant speed gain. To cater for the increased optimization complexity driven by the uptake of smartphones and tablets, however, this paper examines the potential areas for further improvement and presents a novel asynchronous distributed parallel design that minimizes inter-process communication. The new approach is implemented and applied to real-life projects, whose results demonstrate an increased speedup of 7.5 times on a 16-core distributed system compared to 6.1 for our previous solution. Moreover, there is no degradation in the optimization outcome. This is a solid step towards the realization of DNO.

Relevance:

30.00%

Abstract:

A recent study conducted by Blocken et al. (Numerical study on the existence of the Venturi effect in passages between perpendicular buildings. Journal of Engineering Mechanics, 2008, 134: 1021-1028) challenged the popular view of the existence of the ‘Venturi effect’ in building passages as the wind is exposed to an open boundary. The present research extends the work of Blocken et al. (2008a) into a more general setup with the building orientation varying from 0° to 180° using CFD simulations. Our results reveal that the passage flow is mainly determined by the combination of corner streams. It is also shown that converging passages have a higher wind-blocking effect compared to diverging passages, explained by a lower wind speed and higher drag coefficient. Fluxes on the top plane of the passage volume reverse from outflow to inflow in the cases of α=135°, 150° and 165°. A simple mathematical expression to explain the relationship between the flux ratio and the geometric parameters has been developed to aid wind design in an urban neighborhood. In addition, a converging passage with α=15° is recommended for urban wind design in cold and temperate climates since the passage flow changes smoothly and a relatively lower wind speed is expected compared with that where there are no buildings. For high-density urban areas in (sub)tropical climates such as Hong Kong, where more wind is desired, a diverging passage with α=150° is a better choice to promote ventilation at the pedestrian level.

Relevance:

30.00%

Abstract:

An equation of Monge-Ampère type has, for the first time, been solved numerically on the surface of the sphere in order to generate optimally transported (OT) meshes, equidistributed with respect to a monitor function. Optimal transport generates meshes that keep the same connectivity as the original mesh, making them suitable for r-adaptive simulations, in which the equations of motion can be solved in a moving frame of reference in order to avoid mapping the solution between old and new meshes and to avoid load balancing problems on parallel computers. The semi-implicit solution of the Monge-Ampère type equation involves a new linearisation of the Hessian term, and exponential maps are used to map from old to new meshes on the sphere. The determinant of the Hessian is evaluated as the change in volume between old and new mesh cells, rather than using numerical approximations to the gradients. OT meshes are generated to compare with centroidal Voronoi tessellations on the sphere and are found to have advantages and disadvantages: OT equidistribution is more accurate, the number of iterations to convergence is independent of the mesh size, face skewness is reduced and the connectivity does not change. However, anisotropy is higher and the OT meshes are non-orthogonal. It is shown that optimal transport on the sphere leads to meshes that do not tangle. However, tangling can be introduced by numerical errors in calculating the gradient of the mesh potential. Methods for alleviating this problem are explored. Finally, OT meshes are generated using observed precipitation as a monitor function, in order to demonstrate the potential power of the technique.

Relevance:

30.00%

Abstract:

A finite difference technique, based on a projection method, is developed for solving the dynamic three-dimensional Ericksen-Leslie equations for nematic liquid crystals subject to a strong magnetic field. The governing equations in this situation are derived using primitive variables and are solved using the ideas behind the GENSMAC methodology (Tome and McKee [32]; Tome et al. [34]). The resulting numerical technique is then validated by comparing the numerical solution against an analytic solution for steady three-dimensional flow between two parallel plates subject to a strong magnetic field. The validated code is then employed to solve channel flow for which there is no analytic solution. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

30.00%

Abstract:

The InteGrade middleware intends to exploit the idle time of computing resources in computer laboratories. In this work we investigate the performance of running parallel applications with communication among processors on the InteGrade grid. As costly communication on a grid can be prohibitive, we explore the so-called systolic or wavefront paradigm to design the parallel algorithms in which no global communication is used. To evaluate the InteGrade middleware we considered three parallel algorithms that solve the matrix chain product problem, the 0-1 Knapsack Problem, and the local sequence alignment problem, respectively. We show that these three applications running under the InteGrade middleware and MPI take slightly more time than the same applications running on a cluster with only LAM-MPI support. The results can be considered promising and the time difference between the two is not substantial. The overhead of the InteGrade middleware is acceptable, in view of the benefits obtained to facilitate the use of grid computing by the user. These benefits include job submission, checkpointing, security, job migration, etc. Copyright (C) 2009 John Wiley & Sons, Ltd.
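
To show why the matrix chain product problem suits a wavefront design, the sketch below fills the standard dynamic-programming table one anti-diagonal at a time. It is a plain sequential version written for illustration, with a hypothetical function name; it is not the InteGrade or MPI implementation evaluated in the paper.

```python
def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications needed to multiply a chain
    of matrices, where matrix i (0-based) has shape dims[i] x dims[i + 1]."""
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    # Sweep the table by anti-diagonal (chain length). Every cell on one
    # diagonal depends only on cells from earlier diagonals, so the cells
    # of a diagonal are mutually independent: this is the wavefront.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return cost[0][n - 1]
```

In a systolic arrangement each processor would own a band of the table and pass finished cells only to its neighbours, which is how the global communication step is avoided.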

Relevance:

30.00%

Abstract:

A bipartite graph G = (V, W, E) is convex if there exists an ordering of the vertices of W such that, for each v ∈ V, the neighbors of v are consecutive in W. We describe both a sequential and a BSP/CGM algorithm to find a maximum independent set in a convex bipartite graph. The sequential algorithm improves over the running time of the previously known algorithm and the BSP/CGM algorithm is a parallel version of the sequential one. The complexity of the algorithms does not depend on |W|.
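
The convexity definition can be checked directly for a given ordering of W. The sketch below does only that: it verifies the definition and does not implement the maximum independent set algorithms described in the abstract. The function name and the dictionary-based graph representation are assumptions made for illustration.

```python
def is_convex_ordering(w_order, neighbors):
    """Check whether `w_order` (a list of the vertices of W) witnesses
    convexity: for each v in V, the neighbours of v must occupy
    consecutive positions in the ordering.

    `neighbors` maps each vertex of V to an iterable of its neighbours in W.
    """
    position = {w: i for i, w in enumerate(w_order)}
    for nbrs in neighbors.values():
        idx = [position[w] for w in set(nbrs)]
        # Distinct positions are consecutive exactly when their range
        # (max - min + 1) equals their count.
        if idx and max(idx) - min(idx) + 1 != len(idx):
            return False
    return True
```

Note that a graph is convex if some ordering passes this check, so a failing ordering does not by itself show that the graph is non-convex.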

Relevance:

30.00%

Abstract:

Relevant results for (sub-)distribution functions related to parallel systems are discussed. The reverse hazard rate is defined using the product integral. Consequently, the restriction of absolute continuity for the involved distributions can be relaxed. The only restriction is that the sets of discontinuity points of the parallel distributions have to be disjoint. Nonparametric Bayesian estimators of all survival (sub-)distribution functions are derived. Dual to the series systems that use minimum lifetimes as observations, the parallel systems record the maximum lifetimes. Dirichlet multivariate processes forming a class of prior distributions are considered for the nonparametric Bayesian estimation of the component distribution functions and the system reliability. For illustration, two striking numerical examples are presented.
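
As a small illustration of why parallel systems record maximum lifetimes: the system fails only when every component has failed, so under the simplifying assumption of independent components the system distribution function is the product of the component distribution functions. The sketch below uses exponential components; both the independence assumption and the exponential model are illustrative choices, not the nonparametric Bayesian setting of the paper.

```python
import math


def parallel_system_cdf(t, rates):
    """P(system lifetime <= t) for a parallel system whose components are
    independent with exponential lifetimes (assumed here for illustration).

    The system lifetime is the maximum of the component lifetimes, so its
    distribution function is the product of the component ones.
    """
    if t <= 0:
        return 0.0
    return math.prod(1.0 - math.exp(-rate * t) for rate in rates)


def parallel_system_reliability(t, rates):
    """P(system still working at time t) under the same assumptions."""
    return 1.0 - parallel_system_cdf(t, rates)
```

With two identical components, the probability that the system has failed by the component median lifetime is one quarter, compared with one half for a single component.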