982 resultados para parallel admission algorithm


Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present a parallel algorithm for calculating determinants of matrices in arbitrary precision arithmetic on computer clusters. This algorithm limits data movements between the nodes and computes not only the determinant but also all the minors corresponding to a particular row or column at a little extra cost, and also the determinants and minors of all the leading principal submatrices at no extra cost. We implemented the algorithm in arbitrary precision arithmetic, suitable for very ill conditioned matrices, and empirically estimated the loss of precision. In our scenario the cost of computation is bigger than that of data movement. The algorithm was applied to studies of Riemann’s zeta function.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

With the emergence of the big data age, the issue of how to obtain valuable knowledge from a dataset efficiently and accurately has attracted increasingly attention from both academia and industry. This paper presents a Parallel Random Forest (PRF) algorithm for big data on the Apache Spark platform. The PRF algorithm is optimized based on a hybrid approach combining data-parallel and task-parallel optimization. From the perspective of data-parallel optimization, a vertical data-partitioning method is performed to reduce the data communication cost effectively, and a data-multiplexing method is performed is performed to allow the training dataset to be reused and diminish the volume of data. From the perspective of task-parallel optimization, a dual parallel approach is carried out in the training process of RF, and a task Directed Acyclic Graph (DAG) is created according to the parallel training process of PRF and the dependence of the Resilient Distributed Datasets (RDD) objects. Then, different task schedulers are invoked for the tasks in the DAG. Moreover, to improve the algorithm's accuracy for large, high-dimensional, and noisy data, we perform a dimension-reduction approach in the training process and a weighted voting approach in the prediction process prior to parallelization. Extensive experimental results indicate the superiority and notable advantages of the PRF algorithm over the relevant algorithms implemented by Spark MLlib and other studies in terms of the classification accuracy, performance, and scalability.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Node-based Local Mesh Generation (NLMG) algorithm, which is free of mesh inconsistency, is one of core algorithms in the Node-based Local Finite Element Method (NLFEM) to achieve the seamless link between mesh generation and stiffness matrix calculation, and the seamless link helps to improve the parallel efficiency of FEM. Furthermore, the key to ensure the efficiency and reliability of NLMG is to determine the candidate satellite-node set of a central node quickly and accurately. This paper develops a Fast Local Search Method based on Uniform Bucket (FLSMUB) and a Fast Local Search Method based on Multilayer Bucket (FLSMMB), and applies them successfully to the decisive problems, i.e. presenting the candidate satellite-node set of any central node in NLMG algorithm. Using FLSMUB or FLSMMB, the NLMG algorithm becomes a practical tool to reduce the parallel computation cost of FEM. Parallel numerical experiments validate that either FLSMUB or FLSMMB is fast, reliable and efficient for their suitable problems and that they are especially effective for computing the large-scale parallel problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, the train scheduling problem is modelled as a blocking parallel-machine job shop scheduling (BPMJSS) problem. In the model, trains, single-track sections and multiple-track sections, respectively, are synonymous with jobs, single machines and parallel machines, and an operation is regarded as the movement/traversal of a train across a section. Due to the lack of buffer space, the real-life case should consider blocking or hold-while-wait constraints, which means that a track section cannot release and must hold the train until next section on the routing becomes available. Based on literature review and our analysis, it is very hard to find a feasible complete schedule directly for BPMJSS problems. Firstly, a parallel-machine job-shop-scheduling (PMJSS) problem is solved by an improved shifting bottleneck procedure (SBP) algorithm without considering blocking conditions. Inspired by the proposed SBP algorithm, feasibility satisfaction procedure (FSP) algorithm is developed to solve and analyse the BPMJSS problem, by an alternative graph model that is an extension of the classical disjunctive graph models. The proposed algorithms have been implemented and validated using real-world data from Queensland Rail. Sensitivity analysis has been applied by considering train length, upgrading track sections, increasing train speed and changing bottleneck sections. The outcomes show that the proposed methodology would be a very useful tool for the real-life train scheduling problems

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A major focus of research in nanotechnology is the development of novel, high throughput techniques for fabrication of arbitrarily shaped surface nanostructures of sub 100 nm to atomic scale. A related pursuit is the development of simple and efficient means for parallel manipulation and redistribution of adsorbed atoms, molecules and nanoparticles on surfaces – adparticle manipulation. These techniques will be used for the manufacture of nanoscale surface supported functional devices in nanotechnologies such as quantum computing, molecular electronics and lab-on-achip, as well as for modifying surfaces to obtain novel optical, electronic, chemical, or mechanical properties. A favourable approach to formation of surface nanostructures is self-assembly. In self-assembly, nanostructures are grown by aggregation of individual adparticles that diffuse by thermally activated processes on the surface. The passive nature of this process means it is generally not suited to formation of arbitrarily shaped structures. The self-assembly of nanostructures at arbitrary positions has been demonstrated, though these have typically required a pre-patterning treatment of the surface using sophisticated techniques such as electron beam lithography. On the other hand, a parallel adparticle manipulation technique would be suited for directing the selfassembly process to occur at arbitrary positions, without the need for pre-patterning the surface. There is at present a lack of techniques for parallel manipulation and redistribution of adparticles to arbitrary positions on the surface. This is an issue that needs to be addressed since these techniques can play an important role in nanotechnology. In this thesis, we propose such a technique – thermal tweezers. In thermal tweezers, adparticles are redistributed by localised heating of the surface. This locally enhances surface diffusion of adparticles so that they rapidly diffuse away from the heated regions. Using this technique, the redistribution of adparticles to form a desired pattern is achieved by heating the surface at specific regions. In this project, we have focussed on the holographic implementation of this approach, where the surface is heated by holographic patterns of interfering pulsed laser beams. This implementation is suitable for the formation of arbitrarily shaped structures; the only condition is that the shape can be produced by holographic means. In the simplest case, the laser pulses are linearly polarised and intersect to form an interference pattern that is a modulation of intensity along a single direction. Strong optical absorption at the intensity maxima of the interference pattern results in approximately a sinusoidal variation of the surface temperature along one direction. The main aim of this research project is to investigate the feasibility of the holographic implementation of thermal tweezers as an adparticle manipulation technique. Firstly, we investigate theoretically the surface diffusion of adparticles in the presence of sinusoidal modulation of the surface temperature. Very strong redistribution of adparticles is predicted when there is strong interaction between the adparticle and the surface, and the amplitude of the temperature modulation is ~100 K. We have proposed a thin metallic film deposited on a glass substrate heated by interfering laser beams (optical wavelengths) as a means of generating very large amplitude of surface temperature modulation. Indeed, we predict theoretically by numerical solution of the thermal conduction equation that amplitude of the temperature modulation on the metallic film can be much greater than 100 K when heated by nanosecond pulses with an energy ~1 mJ. The formation of surface nanostructures of less than 100 nm in width is predicted at optical wavelengths in this implementation of thermal tweezers. Furthermore, we propose a simple extension to this technique where spatial phase shift of the temperature modulation effectively doubles or triples the resolution. At the same time, increased resolution is predicted by reducing the wavelength of the laser pulses. In addition, we present two distinctly different, computationally efficient numerical approaches for theoretical investigation of surface diffusion of interacting adparticles – the Monte Carlo Interaction Method (MCIM) and the random potential well method (RPWM). Using each of these approaches we have investigated thermal tweezers for redistribution of both strongly and weakly interacting adparticles. We have predicted that strong interactions between adparticles can increase the effectiveness of thermal tweezers, by demonstrating practically complete adparticle redistribution into the low temperature regions of the surface. This is promising from the point of view of thermal tweezers applied to directed self-assembly of nanostructures. Finally, we present a new and more efficient numerical approach to theoretical investigation of thermal tweezers of non-interacting adparticles. In this approach, the local diffusion coefficient is determined from solution of the Fokker-Planck equation. The diffusion equation is then solved numerically using the finite volume method (FVM) to directly obtain the probability density of adparticle position. We compare predictions of this approach to those of the Ermak algorithm solution of the Langevin equation, and relatively good agreement is shown at intermediate and high friction. In the low friction regime, we predict and investigate the phenomenon of ‘optimal’ friction and describe its occurrence due to very long jumps of adparticles as they diffuse from the hot regions of the surface. Future research directions, both theoretical and experimental are also discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and P4 classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III class of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and IV classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving linear systems with SSE and discuss advantages and disadvantages of this approach based on our experimental study.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper investigates train scheduling problems when prioritised trains and non-prioritised trains are simultaneously traversed in a single-line rail network. In this case, no-wait conditions arise because the prioritised trains such as express passenger trains should traverse continuously without any interruption. In comparison, non-prioritised trains such as freight trains are allowed to enter the next section immediately if possible or to remain in a section until the next section on the routing becomes available, which is thought of as a relaxation of no-wait conditions. With thorough analysis of the structural properties of the No-Wait Blocking Parallel-Machine Job-Shop-Scheduling (NWBPMJSS) problem that is originated in this research, an innovative generic constructive algorithm (called NWBPMJSS_Liu-Kozan) is proposed to construct the feasible train timetable in terms of a given order of trains. In particular, the proposed NWBPMJSS_Liu-Kozan constructive algorithm comprises several recursively-used sub-algorithms (i.e. Best-Starting-Time-Determination Procedure, Blocking-Time-Determination Procedure, Conflict-Checking Procedure, Conflict-Eliminating Procedure, Tune-up Procedure and Fine-tune Procedure) to guarantee feasibility by satisfying the blocking, no-wait, deadlock-free and conflict-free constraints. A two-stage hybrid heuristic algorithm (NWBPMJSS_Liu-Kozan-BIH) is developed by combining the NWBPMJSS_Liu-Kozan constructive algorithm and the Best-Insertion-Heuristic (BIH) algorithm to find the preferable train schedule in an efficient and economical way. Extensive computational experiments show that the proposed methodology is promising because it can be applied as a standard and fundamental toolbox for identifying, analysing, modelling and solving real-world scheduling problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Here we present a sequential Monte Carlo (SMC) algorithm that can be used for any one-at-a-time Bayesian sequential design problem in the presence of model uncertainty where discrete data are encountered. Our focus is on adaptive design for model discrimination but the methodology is applicable if one has a different design objective such as parameter estimation or prediction. An SMC algorithm is run in parallel for each model and the algorithm relies on a convenient estimator of the evidence of each model which is essentially a function of importance sampling weights. Other methods for this task such as quadrature, often used in design, suffer from the curse of dimensionality. Approximating posterior model probabilities in this way allows us to use model discrimination utility functions derived from information theory that were previously difficult to compute except for conjugate models. A major benefit of the algorithm is that it requires very little problem specific tuning. We demonstrate the methodology on three applications, including discriminating between models for decline in motor neuron numbers in patients suffering from neurological diseases such as Motor Neuron disease.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a maintenance optimisation method for a multi-state series-parallel system considering economic dependence and state-dependent inspection intervals. The objective function considered in the paper is the average revenue per unit time calculated based on the semi-regenerative theory and the universal generating function (UGF). A new algorithm using the stochastic ordering is also developed in this paper to reduce the search space of maintenance strategies and to enhance the efficiency of optimisation algorithms. A numerical simulation is presented in the study to evaluate the efficiency of the proposed maintenance strategy and optimisation algorithms. The simulation result reveals that maintenance strategies with opportunistic maintenance and state-dependent inspection intervals are more cost-effective when the influence of economic dependence and inspection cost is significant. The study further demonstrates that the optimisation algorithm proposed in this paper has higher computational efficiency than the commonly employed heuristic algorithms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Considerate amount of research has proposed optimization-based approaches employing various vibration parameters for structural damage diagnosis. The damage detection by these methods is in fact a result of updating the analytical structural model in line with the current physical model. The feasibility of these approaches has been proven. But most of the verification has been done on simple structures, such as beams or plates. In the application on a complex structure, like steel truss bridges, a traditional optimization process will cost massive computational resources and lengthy convergence. This study presents a multi-layer genetic algorithm (ML-GA) to overcome the problem. Unlike the tedious convergence process in a conventional damage optimization process, in each layer, the proposed algorithm divides the GA’s population into groups with a less number of damage candidates; then, the converged population in each group evolves as an initial population of the next layer, where the groups merge to larger groups. In a damage detection process featuring ML-GA, as parallel computation can be implemented, the optimization performance and computational efficiency can be enhanced. In order to assess the proposed algorithm, the modal strain energy correlation (MSEC) has been considered as the objective function. Several damage scenarios of a complex steel truss bridge’s finite element model have been employed to evaluate the effectiveness and performance of ML-GA, against a conventional GA. In both single- and multiple damage scenarios, the analytical and experimental study shows that the MSEC index has achieved excellent damage indication and efficiency using the proposed ML-GA, whereas the conventional GA only converges at a local solution.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper introduces a parallel implementation of an agent-based model applied to electricity distribution grids. A fine-grained shared memory parallel implementation is presented, detailing the way the agents are grouped and executed on a multi-threaded machine, as well as the way the model is built (in a composable manner) which is an aid to the parallelisation. Current results show a medium level speedup of 2.6, but improvements are expected by incor-porating newer distributed or parallel ABM schedulers into this implementa-tion. While domain-specific, this parallel algorithm can be applied to similarly structured ABMs (directed acyclic graphs).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To minimise the number of load sheddings in a microgrid (MG) during autonomous operation, islanded neighbour MGs can be interconnected if they are on a self-healing network and an extra generation capacity is available in the distributed energy resources (DER) of one of the MGs. In this way, the total load in the system of interconnected MGs can be shared by all the DERs within those MGs. However, for this purpose, carefully designed self-healing and supply restoration control algorithm, protection systems and communication infrastructure are required at the network and MG levels. In this study, first, a hierarchical control structure is discussed for interconnecting the neighbour autonomous MGs where the introduced primary control level is the main focus of this study. Through the developed primary control level, this study demonstrates how the parallel DERs in the system of multiple interconnected autonomous MGs can properly share the load of the system. This controller is designed such that the converter-interfaced DERs operate in a voltage-controlled mode following a decentralised power sharing algorithm based on droop control. DER converters are controlled based on a per-phase technique instead of a conventional direct-quadratic transformation technique. In addition, linear quadratic regulator-based state feedback controllers, which are more stable than conventional proportional integrator controllers, are utilised to prevent instability and weak dynamic performances of the DERs when autonomous MGs are interconnected. The efficacy of the primary control level of the DERs in the system of multiple interconnected autonomous MGs is validated through the PSCAD/EMTDC simulations considering detailed dynamic models of DERs and converters.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

MapReduce is a computation model for processing large data sets in parallel on large clusters of machines, in a reliable, fault-tolerant manner. A MapReduce computation is broken down into a number of map tasks and reduce tasks, which are performed by so called mappers and reducers, respectively. The placement of the mappers and reducers on the machines directly affects the performance and cost of the MapReduce computation. From the computational point of view, the mappers/reducers placement problem is a generation of the classical bin packing problem, which is NPcomplete. Thus, in this paper we propose a new grouping genetic algorithm for the mappers/reducers placement problem in cloud computing. Compared with the original one, our grouping genetic algorithm uses an innovative coding scheme and also eliminates the inversion operator which is an essential operator in the original grouping genetic algorithm. The new grouping genetic algorithm is evaluated by experiments and the experimental results show that it is much more efficient than four popular algorithms for the problem, including the original grouping genetic algorithm.