Digital Library

996 results for parallel technique

Parallel Hybrid Monte Carlo Algorithms for Matrix Computations

Relevance: 20.00%

Abstract:

In this paper we consider hybrid algorithms (fast stochastic approximation followed by deterministic refinement) for Matrix Inversion (MI) and Solving Systems of Linear Equations (SLAE). Monte Carlo methods are used for the stochastic approximation, since they are known to be very efficient at finding a quick rough approximation of an element or a row of the inverse matrix, or a component of the solution vector. We show how the stochastic approximation of the MI can be combined with a deterministic refinement procedure to obtain the MI with the required precision and then solve the SLAE using the MI. We employ a splitting A = D - C of a given non-singular matrix A, where D is a diagonally dominant matrix and C is a diagonal matrix. In our algorithm for solving SLAE and MI, different choices of D can be considered in order to control the norm of the matrix T = D^{-1}C of the resulting SLAE and to minimize the number of Markov chains required to reach a given precision. Further, we run the algorithms on a mini-Grid and investigate their efficiency depending on the granularity. Corresponding experimental results are presented.
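
The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simplest splitting, with D taken as the diagonal of A (the abstract allows other choices of D), estimates each row of the inverse from random walks over the series sum_k T^k D^{-1}, and uses Newton-Schulz iteration as a stand-in for the deterministic refinement. The function names, the test matrix and the chain counts are all hypothetical.

```python
import math
import random

def mc_inverse(A, chains=2000, steps=20, seed=0):
    """Rough Monte Carlo estimate of A^{-1} from the series sum_k T^k D^{-1}."""
    rng = random.Random(seed)
    n = len(A)
    # Assumed splitting: D = diag(A), C = D - A, so T = D^{-1} C.
    T = [[0.0 if i == j else -A[i][j] / A[i][i] for j in range(n)] for i in range(n)]
    row_abs = [sum(abs(t) for t in row) for row in T]
    X = []
    for i in range(n):
        acc = [0.0] * n
        for _chain in range(chains):
            state, w = i, 1.0
            acc[state] += w                      # k = 0 term of the series
            for _step in range(steps):
                if row_abs[state] == 0.0:
                    break                        # no outgoing transitions
                # Move to j with probability |T[state][j]| / row_abs[state].
                nxt = rng.choices(range(n), weights=[abs(t) for t in T[state]])[0]
                # Weight update T / p keeps only the sign times the row sum.
                w *= math.copysign(row_abs[state], T[state][nxt])
                state = nxt
                acc[state] += w
        # Average over the chains, then apply D^{-1} on the right.
        X.append([acc[j] / chains / A[j][j] for j in range(n)])
    return X

def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def refine(A, X, iters=8):
    """Newton-Schulz refinement X <- X (2I - A X); converges if ||I - A X|| < 1."""
    n = len(A)
    for _ in range(iters):
        AX = matmul(A, X)
        X = matmul(X, [[(2.0 if i == j else 0.0) - AX[i][j] for j in range(n)]
                       for i in range(n)])
    return X

def residual(A, X):
    """Largest entry of |A X - I|."""
    AX = matmul(A, X)
    n = len(A)
    return max(abs(AX[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

A = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 1.0, 3.0]]   # hypothetical test matrix
rough = mc_inverse(A)
refined = refine(A, rough)
print(residual(A, rough), residual(A, refined))
```

Each row of the inverse is estimated by independent chains, which is what makes the stochastic stage straightforward to distribute; the refinement only needs the rough estimate to be close enough for the iteration to contract.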

Parallel Monte Carlo algorithms for information retrieval

Relevance: 20.00%

Abstract:

In any data mining application, automated retrieval of text and of combined text-and-image information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use a Monte Carlo method to sample and build a much smaller term-by-document matrix (e.g. a k x k matrix), from which we then find the first "k" triplets using standard deterministic methods. Second, we investigate how we can reduce the problem to finding the "k" largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms run on a cluster of workstations under MPI. Results of experiments arising in textual retrieval of Web documents are presented, together with a comparison of the proposed stochastic methods. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.
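
The reduction from singular triplets to an eigenvalue problem rests on the fact that the singular values of a matrix A are the square roots of the eigenvalues of A^T A. The sketch below shows only that reduction, using plain deterministic power iteration as a stand-in for the parallel Monte Carlo eigenvalue methods the abstract refers to; the function name and the tiny term-by-document matrix are hypothetical.

```python
import math

def largest_singular_value(A, iters=100):
    """Power iteration on A^T A; returns an estimate of the largest singular value."""
    rows, cols = len(A), len(A[0])
    v = [1.0] + [0.0] * (cols - 1)   # start vector; must not be orthogonal to the answer
    eig = 0.0
    for _ in range(iters):
        Av = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(A[i][j] * Av[i] for i in range(rows)) for j in range(cols)]  # A^T (A v)
        eig = math.sqrt(sum(x * x for x in w))   # ||A^T A v|| with ||v|| = 1
        v = [x / eig for x in w]                 # assumes A is not the zero matrix
    return math.sqrt(eig)

# Hypothetical 3-term by 2-document matrix.
term_by_doc = [[2.0, 1.0], [1.0, 2.0], [0.0, 0.0]]
sigma_1 = largest_singular_value(term_by_doc)
print(sigma_1)
```

Finding further triplets would require deflation or a block method, which this sketch leaves out.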

A lightweight technique for assessing risks in requirements analysis

Relevance: 20.00%

Abstract:

A simple and practical technique is described for assessing the risks (that is, the potential for error and consequent loss) in software system development that are acquired during the requirements engineering phase. The technique uses a goal-based requirements analysis as a framework to identify and rate a set of key issues in order to arrive at estimates of the feasibility and adequacy of the requirements. The technique is illustrated by its application to a real systems development project. It is shown how problems in this project could have been identified earlier, thereby avoiding costly additional work and unhappy users.

A sparse parallel hybrid Monte Carlo algorithm for matrix computations

Relevance: 20.00%

Abstract:

In this paper we introduce a new algorithm, based on the successful work of Fathi and Alexandrov on hybrid Monte Carlo algorithms for matrix inversion and solving systems of linear algebraic equations. This algorithm consists of two parts: approximate inversion by Monte Carlo and iterative refinement using a deterministic method. Here we present a parallel hybrid Monte Carlo algorithm, which uses Monte Carlo to generate an approximate inverse and then improves the accuracy of the inverse with iterative refinement. The new algorithm is applied efficiently to sparse non-singular matrices. When solving a system of linear algebraic equations, Bx = b, the inverse matrix is used to compute the solution vector x = B^{-1}b. We present results that show the efficiency of the parallel hybrid Monte Carlo algorithm in the case of sparse matrices.

A clocking technique for FPGA pipelined designs

Relevance: 20.00%

Abstract:

This paper presents a clocking pipeline technique referred to as a single-pulse pipeline (PP-pipeline) and applies it to the problem of mapping pipelined circuits to a Field Programmable Gate Array (FPGA). A PP-pipeline replicates the operation of asynchronous micropipelined control mechanisms using synchronous-orientated logic resources commonly found in FPGA devices. Consequently, circuits with an asynchronous-like pipeline operation can be efficiently synthesized using a synchronous design methodology. The technique can be extended to include data-completion circuitry to take advantage of variable data-completion processing time in synchronous pipelined designs. It is also shown that the PP-pipeline reduces the clock tree power consumption of pipelined circuits. These potential applications are demonstrated by post-synthesis simulation of FPGA circuits. (C) 2004 Elsevier B.V. All rights reserved.

An orthogonal forward regression technique for sparse kernel density estimation

Relevance: 20.00%

Abstract:

Using the classical Parzen window (PW) estimate as the desired response, kernel density estimation is formulated as a regression problem and the orthogonal forward regression technique is adopted to construct sparse kernel density (SKD) estimates. The proposed algorithm incrementally minimises a leave-one-out test score to select a sparse kernel model, and a local regularisation method is incorporated into the density construction process to further enforce sparsity. The kernel weights of the selected sparse model are finally updated using the multiplicative nonnegative quadratic programming algorithm, which ensures the nonnegative and unity constraints for the kernel weights and has the desired ability to reduce the model size further. Except for the kernel width, the proposed method has no other parameters that need tuning, and the user is not required to specify any additional criterion to terminate the density construction procedure. Several examples demonstrate the ability of this simple regression-based approach to effectively construct an SKD estimate with comparable accuracy to that of the full-sample optimised PW density estimate. (c) 2007 Elsevier B.V. All rights reserved.
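
As a rough illustration of the two objects being compared, the sketch below builds a Gaussian Parzen window estimate (every sample is a kernel centre with equal weight 1/N) and a sparse kernel density with only a few centres whose weights are nonnegative and sum to one. The Gaussian kernel, the sample values and the sparse weights are assumptions made for illustration; the selection of centres by orthogonal forward regression is not shown.

```python
import math

def kernel_density(centres, weights, width):
    """Weighted sum of Gaussian kernels; weights must be nonnegative and sum to 1."""
    if any(w < 0.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to one")
    norm = 1.0 / (width * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(w * math.exp(-0.5 * ((x - c) / width) ** 2)
                          for c, w in zip(centres, weights))
    return density

def parzen_window(samples, width):
    """Parzen window estimate: one kernel per sample, each with weight 1/N."""
    n = len(samples)
    return kernel_density(samples, [1.0 / n] * n, width)

samples = [-1.0, 0.0, 0.5, 2.0]                                  # hypothetical data
pw = parzen_window(samples, width=0.5)
sparse = kernel_density([0.0, 2.0], [0.75, 0.25], width=0.5)     # hypothetical sparse model
print(pw(0.0), sparse(0.0))
```

The unity and nonnegativity constraints are what guarantee that the sparse model is itself a valid density, which is why the abstract stresses that the weight update preserves them.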

Computational complexity of weighted splitting schemes on parallel computers

Relevance: 20.00%

Abstract:

In models of complicated physical-chemical processes, operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The recently rediscovered weighted splitting schemes have the great advantage of being parallelizable at the operator level, which allows the computational time to be reduced if parallel computers are used. In this paper, the computational times needed for the weighted splitting methods are studied in comparison with the sequential (S) splitting and the Marchuk-Strang (MSt) splitting, and are illustrated by numerical experiments performed using simplified versions of the Danish Eulerian Model (DEM).
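
The three schemes can be compared on a toy problem. The sketch below is an illustration only, not taken from the paper: it assumes the linear system u' = (A + B)u with A = [[0, 1], [0, 0]] and B = [[0, 0], [1, 0]], for which each sub-problem has a simple exact flow and the full solution from u(0) = (1, 0) is (cosh t, sinh t). The weighted scheme is taken to be the symmetric one with weights 1/2, an assumption about which weighted scheme is meant.

```python
import math

def flow_a(u, h):
    """Exact solution over time h of u' = A u with A = [[0, 1], [0, 0]]."""
    return (u[0] + h * u[1], u[1])

def flow_b(u, h):
    """Exact solution over time h of u' = B u with B = [[0, 0], [1, 0]]."""
    return (u[0], u[1] + h * u[0])

def sequential(u, h):
    return flow_b(flow_a(u, h), h)

def strang(u, h):
    return flow_a(flow_b(flow_a(u, h / 2), h), h / 2)

def weighted(u, h):
    ab = flow_b(flow_a(u, h), h)   # the two orderings are independent of each other,
    ba = flow_a(flow_b(u, h), h)   # so they could be computed on separate processors
    return (0.5 * (ab[0] + ba[0]), 0.5 * (ab[1] + ba[1]))

def error(step, h=0.1, t_end=1.0):
    """Largest component error at t_end against the exact solution (cosh t, sinh t)."""
    u = (1.0, 0.0)
    for _ in range(round(t_end / h)):
        u = step(u, h)
    return max(abs(u[0] - math.cosh(t_end)), abs(u[1] - math.sinh(t_end)))

for name, step in [("sequential", sequential), ("Marchuk-Strang", strang),
                   ("weighted", weighted)]:
    print(name, error(step))
```

On this problem the sequential scheme is first-order accurate while the other two are second-order. The weighted scheme costs two full sequential sweeps per step, but the sweeps do not depend on each other, which is the operator-level parallelism the abstract refers to.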

Empirical mode decomposition: a novel technique for the study of tremor time series

Relevance: 20.00%

Abstract:

Tremor is a clinical feature characterized by oscillations of a part of the body. The detection and study of tremor is an important step in investigations seeking to explain underlying control strategies of the central nervous system under natural (or physiological) and pathological conditions. It is well established that tremorous activity is composed of deterministic and stochastic components. For this reason, the use of digital signal processing (DSP) techniques which take into account the nonlinearity and nonstationarity of such signals may bring new information into the signal analysis which is often obscured by traditional linear techniques (e.g. Fourier analysis). In this context, this paper introduces the application of the empirical mode decomposition (EMD) and Hilbert spectrum (HS), which are relatively new DSP techniques for the analysis of nonlinear and nonstationary time-series, to the study of tremor. Our results, obtained from the analysis of experimental signals collected from 31 patients with different neurological conditions, showed that the EMD could automatically decompose acquired signals into basic components, called intrinsic mode functions (IMFs), representing tremorous and voluntary activity. The identification of a physical meaning for IMFs in the context of tremor analysis suggests an alternative and new way of detecting tremorous activity. These results may be relevant for those applications requiring automatic detection of tremor. Furthermore, the energy of IMFs was visualized as a function of time and frequency by means of the HS. This analysis showed that the variation of energy of tremorous and voluntary activity could be distinguished and characterized on the HS. Such results may be relevant for those applications aiming to identify neurological disorders. In general, both the HS and EMD proved to be very useful for objective analysis of any kind of tremor and can therefore potentially be used for functional assessment.

Parallel implementation and one year experiments with the Danish Eulerian Model

Relevance: 20.00%

Abstract:

Large scale air pollution models are powerful tools, designed to meet the increasing demand in different environmental studies. The atmosphere is the most dynamic component of the environment, where pollutants can be transported quickly over long distances. Therefore air pollution modeling must be done in a large computational domain. Moreover, all relevant physical, chemical and photochemical processes must be taken into account. In such complex models operator splitting is very often applied in order to achieve sufficient accuracy as well as efficiency of the numerical solution. The Danish Eulerian Model (DEM) is one of the most advanced such models. Its space domain (4800 × 4800 km) covers Europe, most of the Mediterranean and neighboring parts of Asia and the Atlantic Ocean. Efficient parallelization is crucial for the performance and practical capabilities of this huge computational model. Different splitting schemes, based on the main processes mentioned above, have been implemented and tested with respect to accuracy and performance in the new version of DEM. Some numerical results of these experiments are presented in this paper.

A comparative study of Java and C performance in two large-scale parallel applications

Relevance: 20.00%

Abstract:

In the 1990s the Message Passing Interface Forum defined MPI bindings for Fortran, C, and C++. With the success of MPI these relatively conservative languages have continued to dominate in the parallel computing community. There are compelling arguments in favour of more modern languages like Java. These include portability, better runtime error checking, modularity, and multi-threading. But these arguments have not converted many HPC programmers, perhaps due to the scarcity of full-scale scientific Java codes, and the lack of evidence for performance competitive with C or Fortran. This paper tries to redress this situation by porting two scientific applications to Java. Both of these applications are parallelized using our thread-safe Java messaging system, MPJ Express. The first application is the Gadget-2 code, which is a massively parallel structure formation code for cosmological simulations. The second application uses the finite-difference time-domain method for simulations in the area of computational electromagnetics. We evaluate and compare the performance of the Java and C versions of these two scientific applications, and demonstrate that the Java codes can achieve performance comparable with legacy applications written in conventional HPC languages. Copyright © 2009 John Wiley & Sons, Ltd.

A parallel convolutional coder including embedded puncturing with application to consumer devices

Relevance: 20.00%

Abstract:

As consumers demand more functionality from their electronic devices and manufacturers meet that demand, electrical power and clock requirements tend to increase; however, reassessing the system architecture can lead to compensating reductions. To maintain low clock rates and therefore reduce electrical power, this paper presents a parallel convolutional coder for the transmit side of many wireless consumer devices. The coder accepts a parallel data input and directly computes punctured convolutional codes without the need for a separate puncturing operation, and the coded bits are available at the output of the coder in parallel. Because the computation is performed in parallel, the coder can be clocked 7 times slower than the conventional shift-register based convolutional coder (using the DVB 7/8 rate). The presented coder is directly relevant to the design of modern low-power consumer devices.
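
A software reference model can make the input and output widths concrete: at rate 7/8, each block of 7 input bits yields 8 coded bits, which is why a coder that handles a whole block per clock can run 7 times slower than a bit-serial one. The sketch below is a serial model of that per-block computation, not the paper's parallel hardware design. The generator polynomials (171 and 133 octal, constraint length 7), the puncturing patterns and the bit ordering are assumptions based on values commonly associated with DVB and are not stated in the abstract.

```python
G1, G2 = 0o171, 0o133                  # assumed generator polynomials (K = 7)
PUNCTURE_X = [1, 0, 0, 0, 1, 0, 1]     # assumed 7/8 pattern for the G1 stream
PUNCTURE_Y = [1, 1, 1, 1, 0, 1, 0]     # assumed 7/8 pattern for the G2 stream

def parity(value):
    """Return 1 if value has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1

def encode_block(bits, state=0):
    """Encode 7 input bits into 8 punctured output bits; returns (output, new_state)."""
    out = []
    for k, bit in enumerate(bits):
        window = (bit << 6) | state    # newest bit in the top position of a 7-bit window
        x, y = parity(window & G1), parity(window & G2)
        if PUNCTURE_X[k]:
            out.append(x)              # punctured bits are never emitted,
        if PUNCTURE_Y[k]:
            out.append(y)              # so no separate puncturing pass is needed
        state = window >> 1            # keep the 6 most recent input bits
    return out, state

coded, next_state = encode_block([1, 0, 1, 1, 0, 0, 1])
print(coded, next_state)
```

In hardware the loop would be unrolled into combinational logic, so that all 8 output bits are produced from the 7 inputs and the 6-bit state in a single clock cycle.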

Applying autonomic computing concepts to parallel computing using intelligent agents

Relevance: 20.00%

Abstract:

The work reported in this paper is motivated by the need to apply autonomic computing concepts to parallel computing systems. Advancing on prior work based on intelligent cores [36], a swarm-array computing approach, this paper focuses on ‘intelligent agents’, another swarm-array computing approach in which the task to be executed on a parallel computing core is considered as a swarm of autonomous agents. A task is carried to a computing core by carrier agents and is seamlessly transferred between cores in the event of a predicted failure, thereby achieving the self-ware objectives of autonomic computing. The feasibility of the proposed swarm-array computing approach is validated on a multi-agent simulator.

Towards self-ware via swarm-array computing

Relevance: 20.00%

Abstract:

The work reported in this paper proposes swarm-array computing, a novel technique inspired by swarm robotics and built on the foundations of autonomic and parallel computing. The approach aims to apply autonomic computing constructs to parallel computing systems and in effect achieve the self-ware objectives that describe self-managing systems. The constitution of swarm-array computing, comprising four constituents (the computing system, the problem/task, the swarm and the landscape), is considered. Approaches that bind these constituents together are proposed. Space applications employing FPGAs are identified as a potential area for applying swarm-array computing to build reliable systems. The feasibility of a proposed approach is validated on the SeSAm multi-agent simulator, and landscapes are generated using the MATLAB toolkit.

Design of highly parallel architecture with Alpha and Handel

Relevance: 20.00%

Abstract:

We propose a bridge between two important parallel programming paradigms: data parallelism and communicating sequential processes (CSP). Data parallel pipelined architectures obtained with the Alpha language can be embedded in a control intensive application expressed in CSP-based Handel formalism. The interface is formally defined from the semantics of the languages Alpha and Handel. This work will ease the design of compute intensive applications on FPGAs.

Compositional technique for synthesising multi-phase regular arrays

Relevance: 20.00%

Abstract:

We describe a high-level design method to synthesize multi-phase regular arrays. The method is based on deriving component designs using classical regular (or systolic) array synthesis techniques and composing these separately evolved component designs into a unified global design. Similarity transformations are applied to component designs in the composition stage in order to align data flow between the phases of the computations. Three transformations are considered: rotation, reflection and translation. The technique is aimed at the design of hardware components for high-throughput embedded systems applications, and we demonstrate this by deriving a multi-phase regular array for the 2-D DCT algorithm, which is widely used in many video communications applications.
