46 results for PDE-based parallel preconditioner
Abstract:
In many data mining applications, automated retrieval of text and image information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use a Monte Carlo method to sample and build a much smaller term-by-document matrix (e.g. a k x k matrix), from which we then find the first "k" triplets using standard deterministic methods. Second, we investigate how the problem can be reduced to finding the "k" largest eigenvalues using parallel Monte Carlo methods. We apply these methods both to the initial matrix and to the reduced one. The algorithms run on a cluster of workstations under MPI; results of experiments in textual retrieval of Web documents, as well as a comparison of the proposed stochastic methods, are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.
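The abstract does not spell out the sampling rule; the following minimal sketch uses one standard Monte Carlo reduction (length-squared row/column sampling) to build a small k x k matrix whose singular values approximate the leading ones of the full term-by-document matrix. The matrix A and all parameters below are illustrative assumptions, not the authors' data or exact method.

import numpy as np

def monte_carlo_reduced_matrix(A, k, rng):
    """Build a k x k matrix W from A by length-squared row/column sampling
    (one standard Monte Carlo scheme; the paper's exact rule may differ)."""
    m, n = A.shape
    col_p = (A ** 2).sum(axis=0) / (A ** 2).sum()    # column sampling probabilities
    cols = rng.choice(n, size=k, p=col_p)
    C = A[:, cols] / np.sqrt(k * col_p[cols])        # rescaled sampled columns
    row_p = (C ** 2).sum(axis=1) / (C ** 2).sum()    # row sampling probabilities
    rows = rng.choice(m, size=k, p=row_p)
    return C[rows, :] / np.sqrt(k * row_p[rows])[:, None]

rng = np.random.default_rng(0)
A = rng.random((5000, 2000))                 # stand-in for a term-by-document matrix
W = monte_carlo_reduced_matrix(A, 20, rng)   # much smaller 20 x 20 matrix
sigma = np.linalg.svd(W, compute_uv=False)   # approximates the leading singular values of A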
Abstract:
In this paper we introduce a new algorithm, based on the successful work of Fathi and Alexandrov on hybrid Monte Carlo algorithms for matrix inversion and solving systems of linear algebraic equations. The algorithm consists of two parts: approximate inversion by Monte Carlo, and iterative refinement of that inverse using a deterministic method. Here we present a parallel hybrid Monte Carlo algorithm which uses Monte Carlo to generate an approximate inverse and then improves its accuracy by iterative refinement. The new algorithm is applied efficiently to sparse non-singular matrices. When solving a system of linear algebraic equations Bx = b, the inverse matrix is used to compute the solution vector x = B^(-1)b. We present results that show the efficiency of the parallel hybrid Monte Carlo algorithm in the case of sparse matrices.
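The abstract does not name the deterministic refinement step; a minimal sketch assuming a Newton-Schulz-type refinement of the approximate inverse is shown below. The test matrix B and the placeholder initial inverse X0 are illustrative assumptions standing in for the Monte Carlo estimate.

import numpy as np

def refine_inverse(B, X, sweeps=5):
    # Newton-Schulz refinement: X_{k+1} = X_k (2I - B X_k);
    # converges when the initial residual norm ||I - B X_0|| < 1
    I = np.eye(B.shape[0])
    for _ in range(sweeps):
        X = X @ (2.0 * I - B @ X)
    return X

rng = np.random.default_rng(0)
n = 200
B = np.eye(n) + 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)   # well-conditioned test matrix
X0 = np.eye(n)                    # placeholder for the Monte Carlo approximate inverse
X = refine_inverse(B, X0)

b = rng.standard_normal(n)
x = X @ b                         # solve Bx = b via x = B^(-1) b
print(np.linalg.norm(B @ x - b))  # residual of the refined solution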
Abstract:
Large-scale air pollution models are powerful tools designed to meet the increasing demand in different environmental studies. The atmosphere is the most dynamic component of the environment, where pollutants can be transported quickly over long distances. Therefore, air pollution modeling must be done in a large computational domain. Moreover, all relevant physical, chemical and photochemical processes must be taken into account. In such complex models, operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The Danish Eulerian Model (DEM) is one of the most advanced such models. Its space domain (4800 × 4800 km) covers Europe, most of the Mediterranean, and neighboring parts of Asia and the Atlantic Ocean. Efficient parallelization is crucial for the performance and practical capabilities of this huge computational model. Different splitting schemes, based on the main processes mentioned above, have been implemented and tested with respect to accuracy and performance in the new version of DEM. Some numerical results of these experiments are presented in this paper.
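A one-dimensional toy illustration of operator splitting (not DEM's actual scheme or full process set) is sketched below; the advection and chemistry substeps, grid, wind speed and decay rate are all assumptions made only for the example.

import numpy as np

def advect(c, wind, dx, dt):
    # first-order upwind advection substep
    return c - wind * dt / dx * (c - np.roll(c, 1))

def chemistry(c, k_loss, dt):
    # linear decay standing in for the chemical submodel
    return c * np.exp(-k_loss * dt)

def strang_split_step(c, wind, dx, k_loss, dt):
    # Strang splitting: half advection, full chemistry, half advection
    c = advect(c, wind, dx, dt / 2)
    c = chemistry(c, k_loss, dt)
    return advect(c, wind, dx, dt / 2)

x = np.linspace(0.0, 4800.0, 481)             # 1-D slice of a 4800 km domain, 10 km spacing
c = np.exp(-((x - 1000.0) / 100.0) ** 2)      # initial pollutant plume
for _ in range(100):
    c = strang_split_step(c, wind=20.0, dx=x[1] - x[0], k_loss=0.1, dt=0.25)  # km/h, 1/h, h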
Abstract:
As consumers demand more functionality from their electronic devices and manufacturers meet that demand, electrical power and clock requirements tend to increase; reassessing the system architecture, however, can lead to suitable reductions. To maintain low clock rates and therefore reduce electrical power, this paper presents a parallel convolutional coder for the transmit side of many wireless consumer devices. The coder accepts a parallel data input and directly computes punctured convolutional codes without the need for a separate puncturing operation, while the coded bits are available at the output of the coder in parallel. Also, because the computation is performed in parallel, the coder can be clocked seven times slower than the conventional shift-register based convolutional coder (using the DVB 7/8 rate). The presented coder is directly relevant to the design of modern low-power consumer devices.
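For reference, the serial behaviour such a coder must reproduce is a rate-1/2 mother convolutional code followed by puncturing to rate 7/8; the sketch below encodes that way in software. The generator polynomials (171/133 octal) and the puncturing pattern are the commonly quoted DVB values, used here as illustrative assumptions, and the parallel hardware structure of the paper is not reproduced.

G1, G2 = 0o171, 0o133                 # rate-1/2 mother code generators, constraint length 7
PUNCTURE_X = [1, 0, 0, 0, 1, 0, 1]    # commonly quoted DVB 7/8 puncturing pattern
PUNCTURE_Y = [1, 1, 1, 1, 0, 1, 0]    # (treat as illustrative)

def parity(x):
    return bin(x).count("1") & 1

def encode_punctured(bits):
    state, out = 0, []
    for i, b in enumerate(bits):
        state = ((state << 1) | b) & 0x7F           # 7-bit shift register
        x, y = parity(state & G1), parity(state & G2)
        if PUNCTURE_X[i % 7]:
            out.append(x)                           # keep only un-punctured bits
        if PUNCTURE_Y[i % 7]:
            out.append(y)
    return out

coded = encode_punctured([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0])
print(len(coded))   # 14 input bits -> 16 coded bits, i.e. rate 7/8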
Abstract:
The work reported in this paper is motivated by the need to apply autonomic computing concepts to parallel computing systems. Advancing prior work based on intelligent cores [36], a swarm-array computing approach, this paper focuses on ‘Intelligent agents’, another swarm-array computing approach, in which the task to be executed on a parallel computing core is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and is seamlessly transferred between cores in the event of a predicted failure, thereby achieving the self-ware objectives of autonomic computing. The feasibility of the proposed swarm-array computing approach is validated on a multi-agent simulator.
Abstract:
We propose a bridge between two important parallel programming paradigms: data parallelism and communicating sequential processes (CSP). Data-parallel pipelined architectures obtained with the Alpha language can be embedded in a control-intensive application expressed in the CSP-based Handel formalism. The interface is formally defined from the semantics of the languages Alpha and Handel. This work will ease the design of compute-intensive applications on FPGAs.
Abstract:
This paper is concerned with the uniformization of a system of affine recurrence equations. This transformation is used in the design (or compilation) of highly parallel embedded systems (VLSI systolic arrays, signal processing filters, etc.). In this paper, we present and implement an automatic system to achieve uniformization of systems of affine recurrence equations. We unify the results from many earlier papers, develop some theoretical extensions, and then propose effective uniformization algorithms. Our results can be used in any high-level synthesis tool based on a polyhedral representation of nested loop computations.
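As a textbook illustration of the transformation (an example chosen here, not taken from the paper), the affine broadcast dependence

Y(i, j) = X(j),    0 <= i, j <= N

can be uniformized by introducing a pipelining variable P:

Y(i, j) = P(i, j)
P(i, j) = P(i - 1, j),    i > 0
P(0, j) = X(j)

after which every dependence is a constant (uniform) translation, here (1, 0).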
Abstract:
Deep Brain Stimulation (DBS) has been successfully used throughout the world for the treatment of Parkinson's disease symptoms. To control abnormal spontaneous electrical activity in target brain areas, DBS utilizes a continuous stimulation signal. This continuous power draw means that the implanted battery power source needs to be replaced every 18–24 months. To prolong the life span of the battery, a technique to accurately recognize and predict the onset of Parkinson's disease tremors in human subjects, and thus implement an on-demand stimulator, is discussed here. The approach is to use a radial basis function neural network (RBFNN) based on particle swarm optimization (PSO) and principal component analysis (PCA), with Local Field Potential (LFP) data recorded via the stimulation electrodes, to predict activity related to tremor onset. To test this approach, LFPs from the subthalamic nucleus (STN), obtained through deep brain electrodes implanted in a Parkinson's patient, are used to train the network. To validate the network's performance, electromyographic (EMG) signals from the patient's forearm are recorded in parallel with the LFPs to accurately determine occurrences of tremor, and these are compared to the output of the network. It has been found that detection accuracies of up to 89% are possible. Performance comparisons have also been made between a conventional RBFNN and an RBFNN based on PSO, which show a marginal decrease in performance but a notable reduction in computational overhead.
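A compressed sketch of the PCA-plus-RBF-network part of this pipeline is given below with synthetic stand-in data; in the paper the RBF centres and widths are tuned with PSO, whereas here they are simply taken from training samples, and all feature dimensions and parameters are illustrative assumptions.

import numpy as np

def pca_project(X, n_components):
    # project feature vectors onto the leading principal components
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def rbf_design_matrix(X, centres, width):
    # Gaussian radial basis activations
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 32))            # synthetic stand-in for windowed LFP features
y = (X[:, :4].sum(axis=1) > 0).astype(float)  # surrogate tremor / no-tremor label
Z = pca_project(X, n_components=8)
centres = Z[rng.choice(len(Z), size=20, replace=False)]   # PSO would tune these in the paper
Phi = rbf_design_matrix(Z, centres, width=1.0)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # output-layer weights by least squares
pred = (Phi @ w > 0.5).astype(int)            # predicted tremor onset
print("training accuracy:", (pred == y).mean())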
Abstract:
Recent research in multi-agent systems incorporates fault tolerance concepts but does not explore the extension and implementation of such ideas for large-scale parallel computing systems. The work reported in this paper investigates a swarm-array computing approach, namely 'Intelligent Agents'. A task to be executed on a parallel computing system is decomposed into sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information in the event of a predicted core/processor failure and to successfully complete the task. The feasibility of the approach is validated by the implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.
Abstract:
Recent research in multi-agent systems incorporates fault tolerance concepts but does not explore the extension and implementation of such ideas for large-scale parallel computing systems. The work reported in this paper investigates a swarm-array computing approach, namely 'Intelligent Agents'. A task to be executed on a parallel computing system is decomposed into sub-tasks and mapped onto agents that traverse an abstracted hardware layer. The agents intercommunicate across processors to share information in the event of a predicted core/processor failure and to successfully complete the task. The feasibility of the approach is validated by simulations on an FPGA using a multi-agent simulator, and by the implementation of a parallel reduction algorithm on a computer cluster using the Message Passing Interface.
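A minimal sketch of an MPI-based parallel reduction of the kind mentioned here, written with mpi4py; the agent layer and fault-tolerance machinery described in the abstract are not reproduced, and the local workload is an arbitrary example.

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

local = np.arange(rank * 1000, (rank + 1) * 1000, dtype=np.float64)
local_sum = local.sum()                              # each process reduces its own sub-task
total = comm.reduce(local_sum, op=MPI.SUM, root=0)   # cross-processor reduction

if rank == 0:
    print("global sum across", size, "processes:", total)

# launched on a cluster with, e.g.:  mpirun -n 8 python reduction.py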
Abstract:
A connection between a fuzzy neural network model and the mixture of experts network (MEN) modelling approach is established. Based on this linkage, two new neuro-fuzzy MEN construction algorithms are proposed to overcome the curse of dimensionality that is inherent in the majority of associative memory networks and/or other rule-based systems. The first construction algorithm employs a function selection manager module in an MEN system. The second construction algorithm is based on a new parallel learning algorithm in which each model rule is trained independently, and for which the parameter convergence property of the new learning method is established. As with the first approach, an expert selection criterion is utilised in this algorithm. The two construction methods are equally effective in overcoming the curse of dimensionality by reducing the dimensionality of the regression vector, but the latter has the additional computational advantage of parallel processing. The proposed algorithms are analysed for effectiveness, followed by numerical examples that illustrate their efficacy on some difficult data-based modelling problems.
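A rough sketch of the "each rule trained independently" idea is shown below: the regression vector is split across experts and each expert is fitted in a separate process. The data, the splitting and the least-squares fitting are illustrative assumptions; the gating/expert-selection criterion of the paper is not reproduced.

import numpy as np
from concurrent.futures import ProcessPoolExecutor

def train_expert(args):
    # each expert/rule fits its own low-dimensional slice of the regression vector
    X_sub, y = args
    w, *_ = np.linalg.lstsq(X_sub, y, rcond=None)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((400, 12))                        # full regression vectors
    y = X @ rng.standard_normal(12) + 0.05 * rng.standard_normal(400)
    expert_inputs = [X[:, i:i + 3] for i in range(0, 12, 3)]  # 4 experts, 3 inputs each

    with ProcessPoolExecutor() as pool:                       # experts trained in parallel
        weights = list(pool.map(train_expert, [(Xs, y) for Xs in expert_inputs]))
    print([w.shape for w in weights])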
Abstract:
The adsorption of gases on microporous carbons is still poorly understood, partly because the structure of these carbons is not well known. Here, a model of microporous carbons based on fullerene-like fragments is used as the basis for a theoretical study of Ar adsorption on carbon. First, a simulation box was constructed, containing a plausible arrangement of carbon fragments. Next, using a new Monte Carlo simulation algorithm, two types of carbon fragments were gradually placed into the initial structure to increase its microporosity. Thirty-six different microporous carbon structures were generated in this way. Using the method proposed recently by Bhattacharya and Gubbins (BG), the micropore size distributions of the obtained carbon models and the average micropore diameters were calculated. For ten chosen structures, Ar adsorption isotherms (87 K) were simulated via the hyper-parallel tempering Monte Carlo simulation method. The isotherms obtained in this way were described by widely applied methods of microporous carbon characterisation, i.e. Nguyen and Do, Horvath-Kawazoe, high-resolution αs plots, adsorption potential distributions and the Dubinin-Astakhov (DA) equation. From the simulated isotherms described by the DA equation, the average micropore diameters were calculated using empirical relationships proposed by different authors, and they were compared with those from the BG method.
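The core move in (hyper-)parallel tempering is the replica-exchange swap; a minimal sketch of the canonical-ensemble acceptance rule is shown below. The hyper-parallel variant used in the paper also exchanges across chemical potentials in a grand-canonical setting, and the temperatures and energies here are purely illustrative.

import numpy as np

def attempt_swap(beta_i, E_i, beta_j, E_j, rng):
    # replica-exchange acceptance: min(1, exp[(beta_i - beta_j)(E_i - E_j)])
    delta = (beta_i - beta_j) * (E_i - E_j)
    return rng.random() < min(1.0, np.exp(delta))

rng = np.random.default_rng(0)
betas = 1.0 / np.linspace(87.0, 300.0, 8)      # inverse "temperatures" of 8 replicas (illustrative units)
energies = rng.normal(-500.0, 20.0, size=8)    # instantaneous configuration energies
for k in range(len(betas) - 1):                # attempt swaps between neighbouring replicas
    if attempt_swap(betas[k], energies[k], betas[k + 1], energies[k + 1], rng):
        energies[k], energies[k + 1] = energies[k + 1], energies[k]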
Abstract:
Self-assembly in aqueous solution has been investigated for two Fmoc [Fmoc = N-(fluorenyl)-9-methoxycarbonyl] tetrapeptides comprising the RGDS cell adhesion motif from fibronectin or the scrambled sequence GRDS. The hydrophobic Fmoc unit confers amphiphilicity on the molecules and introduces aromatic stacking interactions. Circular dichroism and FTIR spectroscopy show that the self-assembly of both peptides at low concentration is dominated by interactions among Fmoc units, although Fmoc-GRDS shows β-sheet features at lower concentration than Fmoc-RGDS. Fibre X-ray diffraction indicates β-sheet formation by both peptides at sufficiently high concentration. Strong alignment effects are revealed by linear dichroism experiments for Fmoc-GRDS. Cryo-TEM and small-angle X-ray scattering (SAXS) reveal that both samples form fibrils with a diameter of approximately 10 nm. Both Fmoc-tetrapeptides form self-supporting hydrogels at sufficiently high concentration. Dynamic shear rheometry enabled measurements of the moduli for the Fmoc-GRDS hydrogel; however, syneresis was observed for the Fmoc-RGDS hydrogel, which was significantly less stable to shear. Molecular dynamics computer simulations were carried out considering parallel and antiparallel β-sheet configurations of systems containing 7 and 21 molecules of Fmoc-RGDS or Fmoc-GRDS, with the results analyzed in terms of both intermolecular structural parameters and energy contributions.
Abstract:
This research establishes the feasibility of using a network-centric technology, Jini, to provide a grid framework on which to perform parallel video encoding. A solution was implemented using Jini and achieved real-time, on-demand encoding of a 480 HD video stream. Further, a projection is made concerning the encoding of 1080 HD video in real time, as the current grid was not powerful enough to achieve this above 15 fps. The research found that Jini is able to provide a number of tools and services highly applicable in a grid environment. It is also suitable in terms of performance and responds well to a varying number of grid nodes. The main performance limiter was found to be the network bandwidth allocation, which, when loaded with a large number of grid nodes, was unable to handle the traffic.