Biblioteca Digital

188 resultados para Parallel Algorithms

em CentAUR: Central Archive University of Reading - UK

Tools for regularizing Array Designs

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper makes a contribution in bridging the theory and practice of the polyhedral model for designing parallel algorithms. Although the theory of polyhedral model is well developed, designers of massively parallel algorithms are unable to benefit from the theory due to the lack of software tools that incorporate the wide range of transformations that are possible in the model. The Uniformization tool that we developed was the first to integrate a number of techniques and to completely automate the transformation step allowing designers to explore a wide range of feasible designs from high-level specifications.

On the double-vertex-cycle-connectivity of crossed cubes

Relevância:

60.00% 60.00%

Publicador:

Pancyclicity of Mobius cubes with faulty nodes

Relevância:

60.00% 60.00%

Publicador:

Resumo:

An interconnection network with n nodes is four-pancyclic if it contains a cycle of length l for each integer l with 4 <= l <= n. An interconnection network is fault-tolerant four-pancyclic if the surviving network is four-pancyclic in the presence of faults. The fault-tolerant four-pancyclicity of interconnection networks is a desired property because many classical parallel algorithms can be mapped onto such networks in a communication-efficient fashion, even in the presence of failing nodes or edges. Due to some attractive properties as compared with its hypercube counterpart of the same size, the Mobius cube has been proposed as a promising candidate for interconnection topology. Hsieh and Chen [S.Y. Hsieh, C.H. Chen, Pancyclicity on Mobius cubes with maximal edge faults, Parallel Computing, 30(3) (2004) 407-421.] showed that an n-dimensional Mobius cube is four-pancyclic in the presence of up to n-2 faulty edges. In this paper, we show that an n-dimensional Mobius cube is four-pancyclic in the presence of up to n-2 faulty nodes. The obtained result is optimal in that, if n-1 nodes are removed, the surviving network may not be four-pancyclic. (C) 2005 Elsevier B.V. All rights reserved.

Convergence analysis of stochastic diffusion search

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we present a connectionist searching technique - the Stochastic Diffusion Search (SDS), capable of rapidly locating a specified pattern in a noisy search space. In operation SDS finds the position of the pre-specified pattern or if it does not exist - its best instantiation in the search space. This is achieved via parallel exploration of the whole search space by an ensemble of agents searching in a competitive cooperative manner. We prove mathematically the convergence of stochastic diffusion search. SDS converges to a statistical equilibrium when it locates the best instantiation of the object in the search space. Experiments presented in this paper indicate the high robustness of SDS and show good scalability with problem size. The convergence characteristic of SDS makes it a fully adaptive algorithm and suggests applications in dynamically changing environments.

Improved program performance using a cluster of workstations

Relevância:

60.00% 60.00%

Publicador:

PDM 2006: Proceedings of the International Workshop on Data Mining

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Recently major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single core CPUs, the trend clearly goes towards multi core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be on the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution focuses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach for dis- tributed memory systems of the frequent subgraphs mining problem. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational envi- ronments. The forth paper describes the combined use and the customization of software packages to facilitate a top down parallelism in the tuning of Support Vector Machines (SVM) and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution finally focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that has provided detailed reviews for each paper. We would like to also thank Matthew Otey who helped with publicity for the workshop.

Parallel genetic algorithms for the tuning of a fuzzy AQM controller

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents the results of the application of a parallel Genetic Algorithm (GA) in order to design a Fuzzy Proportional Integral (FPI) controller for active queue management on Internet routers. The Active Queue Management (AQM) policies are those policies of router queue management that allow the detection of network congestion, the notification of such occurrences to the hosts on the network borders, and the adoption of a suitable control policy. Two different parallel implementations of the genetic algorithm are adopted to determine an optimal configuration of the FPI controller parameters. Finally, the results of several experiments carried out on a forty nodes cluster of workstations are presented.

Parallel Hybrid Monte Carlo Algorithms for Matrix Computations

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper we consider hybrid (fast stochastic approximation and deterministic refinement) algorithms for Matrix Inversion (MI) and Solving Systems of Linear Equations (SLAE). Monte Carlo methods are used for the stochastic approximation, since it is known that they are very efficient in finding a quick rough approximation of the element or a row of the inverse matrix or finding a component of the solution vector. We show how the stochastic approximation of the MI can be combined with a deterministic refinement procedure to obtain MI with the required precision and further solve the SLAE using MI. We employ a splitting A = D – C of a given non-singular matrix A, where D is a diagonal dominant matrix and matrix C is a diagonal matrix. In our algorithm for solving SLAE and MI different choices of D can be considered in order to control the norm of matrix T = D –1C, of the resulting SLAE and to minimize the number of the Markov Chains required to reach given precision. Further we run the algorithms on a mini-Grid and investigate their efficiency depending on the granularity. Corresponding experimental results are presented.

Parallel Monte Carlo algorithms for information retrieval

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In any data mining applications, automated text and text and image retrieval of information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on the latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use Monte Carlo method to sample and to build much smaller size term-by-document matrix (e.g. we build k x k matrix) from where we then find the first "k" triplets using standard deterministic methods. Second, we investigate how we can reduce the problem to finding the "k"-largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms are running on a cluster of workstations under MPI and results of the experiments arising in textual retrieval of Web documents as well as comparison of the stochastic methods proposed are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.

Neurofuzzy mixture of experts network parallel learning and model construction algorithms

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A connection between a fuzzy neural network model with the mixture of experts network (MEN) modelling approach is established. Based on this linkage, two new neuro-fuzzy MEN construction algorithms are proposed to overcome the curse of dimensionality that is inherent in the majority of associative memory networks and/or other rule based systems. The first construction algorithm employs a function selection manager module in an MEN system. The second construction algorithm is based on a new parallel learning algorithm in which each model rule is trained independently, for which the parameter convergence property of the new learning method is established. As with the first approach, an expert selection criterion is utilised in this algorithm. These two construction methods are equivalent in their effectiveness in overcoming the curse of dimensionality by reducing the dimensionality of the regression vector, but the latter has the additional computational advantage of parallel processing. The proposed algorithms are analysed for effectiveness followed by numerical examples to illustrate their efficacy for some difficult data based modelling problems.

A parallel genetic algorithm for the Steiner Problem in Networks

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a parallel genetic algorithm to the Steiner Problem in Networks. Several previous papers have proposed the adoption of GAs and others metaheuristics to solve the SPN demonstrating the validity of their approaches. This work differs from them for two main reasons: the dimension and the characteristics of the networks adopted in the experiments and the aim from which it has been originated. The reason that aimed this work was namely to build a comparison term for validating deterministic and computationally inexpensive algorithms which can be used in practical engineering applications, such as the multicast transmission in the Internet. On the other hand, the large dimensions of our sample networks require the adoption of a parallel implementation of the Steiner GA, which is able to deal with such large problem instances.

Implementing a generic systolic array for genetic algorithms

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We have designed a highly parallel design for a simple genetic algorithm using a pipeline of systolic arrays. The systolic design provides high throughput and unidirectional pipelining by exploiting the implicit parallelism in the genetic operators. The design is significant because, unlike other hardware genetic algorithms, it is independent of both the fitness function and the particular chromosome length used in a problem. We have designed and simulated a version of the mutation array using Xilinix FPGA tools to investigate the feasibility of hardware implementation. A simple 5-chromosome mutation array occupies 195 CLBs and is capable of performing more than one million mutations per second. I. Introduction Genetic algorithms (GAs) are established search and optimization techniques which have been applied to a range of engineering and applied problems with considerable success [1]. They operate by maintaining a population of trial solutions encoded, using a suitable encoding scheme.

Systolic random number generation for genetic algorithms

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A parallel hardware random number generator for use with a VLSI genetic algorithm processing device is proposed. The design uses an systolic array of mixed congruential random number generators. The generators are constantly reseeded with the outputs of the proceeding generators to avoid significant biasing of the randomness of the array which would result in longer times for the algorithm to converge to a solution. 1 Introduction In recent years there has been a growing interest in developing hardware genetic algorithm devices [1, 2, 3]. A genetic algorithm (GA) is a stochastic search and optimization technique which attempts to capture the power of natural selection by evolving a population of candidate solutions by a process of selection and reproduction [4]. In keeping with the evolutionary analogy, the solutions are called chromosomes with each chromosome containing a number of genes. Chromosomes are commonly simple binary strings, the bits being the genes.

Mapping a generic systolic array for genetic algorithms onto FPGAs: theory and practice

Relevância:

30.00% 30.00%

Publicador:

Linear predicted two-pass hexagonal algorithm with parallel implementation for motion estimation

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA., Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance or the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.

«
1
2
3
4
5
6
7
8
...
12
13
»