961 resultados para parallel algorithm
Resumo:
Report published in the Proceedings of the National Conference on "Education in the Information Society", Plovdiv, May, 2013
Resumo:
An iterative Monte Carlo algorithm for evaluating linear functionals of the solution of integral equations with polynomial non-linearity is proposed and studied. The method uses a simulation of branching stochastic processes. It is proved that the mathematical expectation of the introduced random variable is equal to a linear functional of the solution. The algorithm uses the so-called almost optimal density function. Numerical examples are considered. Parallel implementation of the algorithm is also realized using the package ATHAPASCAN as an environment for parallel realization.The computational results demonstrate high parallel efficiency of the presented algorithm and give a good solution when almost optimal density function is used as a transition density.
Resumo:
The evolution of wireless communication systems leads to Dynamic Spectrum Allocation for Cognitive Radio, which requires reliable spectrum sensing techniques. Among the spectrum sensing methods proposed in the literature, those that exploit cyclostationary characteristics of radio signals are particularly suitable for communication environments with low signal-to-noise ratios, or with non-stationary noise. However, such methods have high computational complexity that directly raises the power consumption of devices which often have very stringent low-power requirements. We propose a strategy for cyclostationary spectrum sensing with reduced energy consumption. This strategy is based on the principle that p processors working at slower frequencies consume less power than a single processor for the same execution time. We devise a strict relation between the energy savings and common parallel system metrics. The results of simulations show that our strategy promises very significant savings in actual devices.
Resumo:
In this paper, we develop a fast implementation of an hyperspectral coded aperture (HYCA) algorithm on different platforms using OpenCL, an open standard for parallel programing on heterogeneous systems, which includes a wide variety of devices, from dense multicore systems from major manufactures such as Intel or ARM to new accelerators such as graphics processing units (GPUs), field programmable gate arrays (FPGAs), the Intel Xeon Phi and other custom devices. Our proposed implementation of HYCA significantly reduces its computational cost. Our experiments have been conducted using simulated data and reveal considerable acceleration factors. This kind of implementations with the same descriptive language on different architectures are very important in order to really calibrate the possibility of using heterogeneous platforms for efficient hyperspectral imaging processing in real remote sensing missions.
Resumo:
The evolution of wireless communication systems leads to Dynamic Spectrum Allocation for Cognitive Radio, which requires reliable spectrum sensing techniques. Among the spectrum sensing methods proposed in the literature, those that exploit cyclostationary characteristics of radio signals are particularly suitable for communication environments with low signal-to-noise ratios, or with non-stationary noise. However, such methods have high computational complexity that directly raises the power consumption of devices which often have very stringent low-power requirements. We propose a strategy for cyclostationary spectrum sensing with reduced energy consumption. This strategy is based on the principle that p processors working at slower frequencies consume less power than a single processor for the same execution time. We devise a strict relation between the energy savings and common parallel system metrics. The results of simulations show that our strategy promises very significant savings in actual devices.
Resumo:
The Node-based Local Mesh Generation (NLMG) algorithm, which is free of mesh inconsistency, is one of core algorithms in the Node-based Local Finite Element Method (NLFEM) to achieve the seamless link between mesh generation and stiffness matrix calculation, and the seamless link helps to improve the parallel efficiency of FEM. Furthermore, the key to ensure the efficiency and reliability of NLMG is to determine the candidate satellite-node set of a central node quickly and accurately. This paper develops a Fast Local Search Method based on Uniform Bucket (FLSMUB) and a Fast Local Search Method based on Multilayer Bucket (FLSMMB), and applies them successfully to the decisive problems, i.e. presenting the candidate satellite-node set of any central node in NLMG algorithm. Using FLSMUB or FLSMMB, the NLMG algorithm becomes a practical tool to reduce the parallel computation cost of FEM. Parallel numerical experiments validate that either FLSMUB or FLSMMB is fast, reliable and efficient for their suitable problems and that they are especially effective for computing the large-scale parallel problems.
Resumo:
In this paper, the train scheduling problem is modelled as a blocking parallel-machine job shop scheduling (BPMJSS) problem. In the model, trains, single-track sections and multiple-track sections, respectively, are synonymous with jobs, single machines and parallel machines, and an operation is regarded as the movement/traversal of a train across a section. Due to the lack of buffer space, the real-life case should consider blocking or hold-while-wait constraints, which means that a track section cannot release and must hold the train until next section on the routing becomes available. Based on literature review and our analysis, it is very hard to find a feasible complete schedule directly for BPMJSS problems. Firstly, a parallel-machine job-shop-scheduling (PMJSS) problem is solved by an improved shifting bottleneck procedure (SBP) algorithm without considering blocking conditions. Inspired by the proposed SBP algorithm, feasibility satisfaction procedure (FSP) algorithm is developed to solve and analyse the BPMJSS problem, by an alternative graph model that is an extension of the classical disjunctive graph models. The proposed algorithms have been implemented and validated using real-world data from Queensland Rail. Sensitivity analysis has been applied by considering train length, upgrading track sections, increasing train speed and changing bottleneck sections. The outcomes show that the proposed methodology would be a very useful tool for the real-life train scheduling problems
Resumo:
A major focus of research in nanotechnology is the development of novel, high throughput techniques for fabrication of arbitrarily shaped surface nanostructures of sub 100 nm to atomic scale. A related pursuit is the development of simple and efficient means for parallel manipulation and redistribution of adsorbed atoms, molecules and nanoparticles on surfaces – adparticle manipulation. These techniques will be used for the manufacture of nanoscale surface supported functional devices in nanotechnologies such as quantum computing, molecular electronics and lab-on-achip, as well as for modifying surfaces to obtain novel optical, electronic, chemical, or mechanical properties. A favourable approach to formation of surface nanostructures is self-assembly. In self-assembly, nanostructures are grown by aggregation of individual adparticles that diffuse by thermally activated processes on the surface. The passive nature of this process means it is generally not suited to formation of arbitrarily shaped structures. The self-assembly of nanostructures at arbitrary positions has been demonstrated, though these have typically required a pre-patterning treatment of the surface using sophisticated techniques such as electron beam lithography. On the other hand, a parallel adparticle manipulation technique would be suited for directing the selfassembly process to occur at arbitrary positions, without the need for pre-patterning the surface. There is at present a lack of techniques for parallel manipulation and redistribution of adparticles to arbitrary positions on the surface. This is an issue that needs to be addressed since these techniques can play an important role in nanotechnology. In this thesis, we propose such a technique – thermal tweezers. In thermal tweezers, adparticles are redistributed by localised heating of the surface. This locally enhances surface diffusion of adparticles so that they rapidly diffuse away from the heated regions. Using this technique, the redistribution of adparticles to form a desired pattern is achieved by heating the surface at specific regions. In this project, we have focussed on the holographic implementation of this approach, where the surface is heated by holographic patterns of interfering pulsed laser beams. This implementation is suitable for the formation of arbitrarily shaped structures; the only condition is that the shape can be produced by holographic means. In the simplest case, the laser pulses are linearly polarised and intersect to form an interference pattern that is a modulation of intensity along a single direction. Strong optical absorption at the intensity maxima of the interference pattern results in approximately a sinusoidal variation of the surface temperature along one direction. The main aim of this research project is to investigate the feasibility of the holographic implementation of thermal tweezers as an adparticle manipulation technique. Firstly, we investigate theoretically the surface diffusion of adparticles in the presence of sinusoidal modulation of the surface temperature. Very strong redistribution of adparticles is predicted when there is strong interaction between the adparticle and the surface, and the amplitude of the temperature modulation is ~100 K. We have proposed a thin metallic film deposited on a glass substrate heated by interfering laser beams (optical wavelengths) as a means of generating very large amplitude of surface temperature modulation. Indeed, we predict theoretically by numerical solution of the thermal conduction equation that amplitude of the temperature modulation on the metallic film can be much greater than 100 K when heated by nanosecond pulses with an energy ~1 mJ. The formation of surface nanostructures of less than 100 nm in width is predicted at optical wavelengths in this implementation of thermal tweezers. Furthermore, we propose a simple extension to this technique where spatial phase shift of the temperature modulation effectively doubles or triples the resolution. At the same time, increased resolution is predicted by reducing the wavelength of the laser pulses. In addition, we present two distinctly different, computationally efficient numerical approaches for theoretical investigation of surface diffusion of interacting adparticles – the Monte Carlo Interaction Method (MCIM) and the random potential well method (RPWM). Using each of these approaches we have investigated thermal tweezers for redistribution of both strongly and weakly interacting adparticles. We have predicted that strong interactions between adparticles can increase the effectiveness of thermal tweezers, by demonstrating practically complete adparticle redistribution into the low temperature regions of the surface. This is promising from the point of view of thermal tweezers applied to directed self-assembly of nanostructures. Finally, we present a new and more efficient numerical approach to theoretical investigation of thermal tweezers of non-interacting adparticles. In this approach, the local diffusion coefficient is determined from solution of the Fokker-Planck equation. The diffusion equation is then solved numerically using the finite volume method (FVM) to directly obtain the probability density of adparticle position. We compare predictions of this approach to those of the Ermak algorithm solution of the Langevin equation, and relatively good agreement is shown at intermediate and high friction. In the low friction regime, we predict and investigate the phenomenon of ‘optimal’ friction and describe its occurrence due to very long jumps of adparticles as they diffuse from the hot regions of the surface. Future research directions, both theoretical and experimental are also discussed.
Resumo:
Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and P4 classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.
Resumo:
Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III class of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.
Resumo:
Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and IV classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving linear systems with SSE and discuss advantages and disadvantages of this approach based on our experimental study.
Resumo:
The paper investigates train scheduling problems when prioritised trains and non-prioritised trains are simultaneously traversed in a single-line rail network. In this case, no-wait conditions arise because the prioritised trains such as express passenger trains should traverse continuously without any interruption. In comparison, non-prioritised trains such as freight trains are allowed to enter the next section immediately if possible or to remain in a section until the next section on the routing becomes available, which is thought of as a relaxation of no-wait conditions. With thorough analysis of the structural properties of the No-Wait Blocking Parallel-Machine Job-Shop-Scheduling (NWBPMJSS) problem that is originated in this research, an innovative generic constructive algorithm (called NWBPMJSS_Liu-Kozan) is proposed to construct the feasible train timetable in terms of a given order of trains. In particular, the proposed NWBPMJSS_Liu-Kozan constructive algorithm comprises several recursively-used sub-algorithms (i.e. Best-Starting-Time-Determination Procedure, Blocking-Time-Determination Procedure, Conflict-Checking Procedure, Conflict-Eliminating Procedure, Tune-up Procedure and Fine-tune Procedure) to guarantee feasibility by satisfying the blocking, no-wait, deadlock-free and conflict-free constraints. A two-stage hybrid heuristic algorithm (NWBPMJSS_Liu-Kozan-BIH) is developed by combining the NWBPMJSS_Liu-Kozan constructive algorithm and the Best-Insertion-Heuristic (BIH) algorithm to find the preferable train schedule in an efficient and economical way. Extensive computational experiments show that the proposed methodology is promising because it can be applied as a standard and fundamental toolbox for identifying, analysing, modelling and solving real-world scheduling problems.
Resumo:
Here we present a sequential Monte Carlo (SMC) algorithm that can be used for any one-at-a-time Bayesian sequential design problem in the presence of model uncertainty where discrete data are encountered. Our focus is on adaptive design for model discrimination but the methodology is applicable if one has a different design objective such as parameter estimation or prediction. An SMC algorithm is run in parallel for each model and the algorithm relies on a convenient estimator of the evidence of each model which is essentially a function of importance sampling weights. Other methods for this task such as quadrature, often used in design, suffer from the curse of dimensionality. Approximating posterior model probabilities in this way allows us to use model discrimination utility functions derived from information theory that were previously difficult to compute except for conjugate models. A major benefit of the algorithm is that it requires very little problem specific tuning. We demonstrate the methodology on three applications, including discriminating between models for decline in motor neuron numbers in patients suffering from neurological diseases such as Motor Neuron disease.
Resumo:
This paper presents a maintenance optimisation method for a multi-state series-parallel system considering economic dependence and state-dependent inspection intervals. The objective function considered in the paper is the average revenue per unit time calculated based on the semi-regenerative theory and the universal generating function (UGF). A new algorithm using the stochastic ordering is also developed in this paper to reduce the search space of maintenance strategies and to enhance the efficiency of optimisation algorithms. A numerical simulation is presented in the study to evaluate the efficiency of the proposed maintenance strategy and optimisation algorithms. The simulation result reveals that maintenance strategies with opportunistic maintenance and state-dependent inspection intervals are more cost-effective when the influence of economic dependence and inspection cost is significant. The study further demonstrates that the optimisation algorithm proposed in this paper has higher computational efficiency than the commonly employed heuristic algorithms.
Resumo:
Considerate amount of research has proposed optimization-based approaches employing various vibration parameters for structural damage diagnosis. The damage detection by these methods is in fact a result of updating the analytical structural model in line with the current physical model. The feasibility of these approaches has been proven. But most of the verification has been done on simple structures, such as beams or plates. In the application on a complex structure, like steel truss bridges, a traditional optimization process will cost massive computational resources and lengthy convergence. This study presents a multi-layer genetic algorithm (ML-GA) to overcome the problem. Unlike the tedious convergence process in a conventional damage optimization process, in each layer, the proposed algorithm divides the GA’s population into groups with a less number of damage candidates; then, the converged population in each group evolves as an initial population of the next layer, where the groups merge to larger groups. In a damage detection process featuring ML-GA, as parallel computation can be implemented, the optimization performance and computational efficiency can be enhanced. In order to assess the proposed algorithm, the modal strain energy correlation (MSEC) has been considered as the objective function. Several damage scenarios of a complex steel truss bridge’s finite element model have been employed to evaluate the effectiveness and performance of ML-GA, against a conventional GA. In both single- and multiple damage scenarios, the analytical and experimental study shows that the MSEC index has achieved excellent damage indication and efficiency using the proposed ML-GA, whereas the conventional GA only converges at a local solution.