864 resultados para parallel scalability


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recent research in multi-agent systems incorporate fault tolerance concepts. However, the research does not explore the extension and implementation of such ideas for large scale parallel computing systems. The work reported in this paper investigates a swarm array computing approach, namely ‘Intelligent Agents’. In the approach considered a task to be executed on a parallel computing system is decomposed to sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information during the event of a predicted core/processor failure and for successfully completing the task. The agents hence contribute towards fault tolerance and towards building reliable systems. The feasibility of the approach is validated by simulations on an FPGA using a multi-agent simulator and implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The past decade has witnessed explosive growth of mobile subscribers and services. With the purpose of providing better-swifter-cheaper services, radio network optimisation plays a crucial role but faces enormous challenges. The concept of Dynamic Network Optimisation (DNO), therefore, has been introduced to optimally and continuously adjust network configurations, in response to changes in network conditions and traffic. However, the realization of DNO has been seriously hindered by the bottleneck of optimisation speed performance. An advanced distributed parallel solution is presented in this paper, as to bridge the gap by accelerating the sophisticated proprietary network optimisation algorithm, while maintaining the optimisation quality and numerical consistency. The ariesoACP product from Arieso Ltd serves as the main platform for acceleration. This solution has been prototyped, implemented and tested. Real-project based results exhibit a high scalability and substantial acceleration at an average speed-up of 2.5, 4.9 and 6.1 on a distributed 5-core, 9-core and 16-core system, respectively. This significantly outperforms other parallel solutions such as multi-threading. Furthermore, augmented optimisation outcome, alongside high correctness and self-consistency, have also been fulfilled. Overall, this is a breakthrough towards the realization of DNO.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The increasing demand for cheaper-faster-better services anytime and anywhere has made radio network optimisation much more complex than ever before. In order to dynamically optimise the serving network, Dynamic Network Optimisation (DNO), is proposed as the ultimate solution and future trend. The realization of DNO, however, has been hindered by a significant bottleneck of the optimisation speed as the network complexity grows. This paper presents a multi-threaded parallel solution to accelerate complicated proprietary network optimisation algorithms, under a rigid condition of numerical consistency. ariesoACP product from Arieso Ltd serves as the platform for parallelisation. This parallel solution has been benchmarked and results exhibit a high scalability and a run-time reduction by 11% to 42% based on the technology, subscriber density and blocking rate of a given network in comparison with the original version. Further, it is highly essential that the parallel version produces equivalent optimisation quality in terms of identical optimisation outputs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A connection between a fuzzy neural network model with the mixture of experts network (MEN) modelling approach is established. Based on this linkage, two new neuro-fuzzy MEN construction algorithms are proposed to overcome the curse of dimensionality that is inherent in the majority of associative memory networks and/or other rule based systems. The first construction algorithm employs a function selection manager module in an MEN system. The second construction algorithm is based on a new parallel learning algorithm in which each model rule is trained independently, for which the parameter convergence property of the new learning method is established. As with the first approach, an expert selection criterion is utilised in this algorithm. These two construction methods are equivalent in their effectiveness in overcoming the curse of dimensionality by reducing the dimensionality of the regression vector, but the latter has the additional computational advantage of parallel processing. The proposed algorithms are analysed for effectiveness followed by numerical examples to illustrate their efficacy for some difficult data based modelling problems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a connectionist searching technique - the Stochastic Diffusion Search (SDS), capable of rapidly locating a specified pattern in a noisy search space. In operation SDS finds the position of the pre-specified pattern or if it does not exist - its best instantiation in the search space. This is achieved via parallel exploration of the whole search space by an ensemble of agents searching in a competitive cooperative manner. We prove mathematically the convergence of stochastic diffusion search. SDS converges to a statistical equilibrium when it locates the best instantiation of the object in the search space. Experiments presented in this paper indicate the high robustness of SDS and show good scalability with problem size. The convergence characteristic of SDS makes it a fully adaptive algorithm and suggests applications in dynamically changing environments.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Proposed is a unique cell histogram architecture which will process k data items in parallel to compute 2q histogram bins per time step. An array of m/2q cells computes an m-bin histogram with a speed-up factor of k; k ⩾ 2 makes it faster than current dual-ported memory implementations. Furthermore, simple mechanisms for conflict-free storing of the histogram bins into an external memory array are discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The adsorption of gases on microporous carbons is still poorly understood, partly because the structure of these carbons is not well known. Here, a model of microporous carbons based on fullerene- like fragments is used as the basis for a theoretical study of Ar adsorption on carbon. First, a simulation box was constructed, containing a plausible arrangement of carbon fragments. Next, using a new Monte Carlo simulation algorithm, two types of carbon fragments were gradually placed into the initial structure to increase its microporosity. Thirty six different microporous carbon structures were generated in this way. Using the method proposed recently by Bhattacharya and Gubbins ( BG), the micropore size distributions of the obtained carbon models and the average micropore diameters were calculated. For ten chosen structures, Ar adsorption isotherms ( 87 K) were simulated via the hyper- parallel tempering Monte Carlo simulation method. The isotherms obtained in this way were described by widely applied methods of microporous carbon characterisation, i. e. Nguyen and Do, Horvath - Kawazoe, high- resolution alpha(a)s plots, adsorption potential distributions and the Dubinin - Astakhov ( DA) equation. From simulated isotherms described by the DA equation, the average micropore diameters were calculated using empirical relationships proposed by different authors and they were compared with those from the BG method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The real-time parallel computation of histograms using an array of pipelined cells is proposed and prototyped in this paper with application to consumer imaging products. The array operates in two modes: histogram computation and histogram reading. The proposed parallel computation method does not use any memory blocks. The resulting histogram bins can be stored into an external memory block in a pipelined fashion for subsequent reading or streaming of the results. The array of cells can be tuned to accommodate the required data path width in a VLSI image processing engine as present in many imaging consumer devices. Synthesis of the architectures presented in this paper in FPGA are shown to compute the real-time histogram of images streamed at over 36 megapixels at 30 frames/s by processing in parallel 1, 2 or 4 pixels per clock cycle.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The development of large scale virtual reality and simulation systems have been mostly driven by the DIS and HLA standards community. A number of issues are coming to light about the applicability of these standards, in their present state, to the support of general multi-user VR systems. This paper pinpoints four issues that must be readdressed before large scale virtual reality systems become accessible to a larger commercial and public domain: a reduction in the effects of network delays; scalable causal event delivery; update control; and scalable reliable communication. Each of these issues is tackled through a common theme of combining wall clock and causal time-related entity behaviour, knowledge of network delays and prediction of entity behaviour, that together overcome many of the effects of network delay.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The development of large scale virtual reality and simulation systems have been mostly driven by the DIS and HLA standards community. A number of issues are coming to light about the applicability of these standards, in their present state, to the support of general multi-user VR systems. This paper pinpoints four issues that must be readdressed before large scale virtual reality systems become accessible to a larger commercial and public domain: a reduction in the effects of network delays; scalable causal event delivery; update control; and scalable reliable communication. Each of these issues is tackled through a common theme of combining wall clock and causal time-related entity behaviour, knowledge of network delays and prediction of entity behaviour, that together overcome many of the effects of network delays.