Biblioteca Digital

957 resultados para parallel applications

Parallel induction of modular classification rules

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.

Parallel random prism: a computationally efficient ensemble learner for classification

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Generally classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism suffers like any ensemble learner from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.

Non-uniform data distribution for communication-efficient parallel clustering

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Global communicationrequirements andloadimbalanceof someparalleldataminingalgorithms arethe major obstacles to exploitthe computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication costin parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operationwhichhinders thescalabilityoftheapproach.Thisworkstudiesadifferentparallelformulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature ofthe centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means ofmulti-dimensional binary searchtrees. The approachcanalso be extended to accommodate an approximation error which allows a further reduction ofthe communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing element

Dynamic group communication for large-scale parallel data mining

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.

Efficient group communication for large-scale parallel clustering

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploit the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs.

Towards a parallel computationally efficient approach to scaling up data stream classification

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advances in hardware technologies allow to capture and process data in real-time and the resulting high throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real-time. The creation and real-time adaption of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even so these algorithms are fast, they are challenged by high velocity data streams, where data instances are incoming at a fast rate. This is problematic if the applications desire that there is no or only a very little delay between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.

A parallel formulation for the simulation of a generic branch predictor

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A parallel formulation for the simulation of a branch prediction algorithm is presented. This parallel formulation identifies independent tasks in the algorithm which can be executed concurrently. The parallel implementation is based on the multithreading model and two parallel programming platforms: pthreads and Cilk++. Improvement in execution performance by up to 7 times is observed for a generic 2-bit predictor in a 12-core multiprocessor system.

FAST, PARALLEL AND SECURE CRYPTOGRAPHY ALGORITHM USING LORENZ`S ATTRACTOR

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A novel cryptography method based on the Lorenz`s attractor chaotic system is presented. The proposed algorithm is secure and fast, making it practical for general use. We introduce the chaotic operation mode, which provides an interaction among the password, message and a chaotic system. It ensures that the algorithm yields a secure codification, even if the nature of the chaotic system is known. The algorithm has been implemented in two versions: one sequential and slow and the other, parallel and fast. Our algorithm assures the integrity of the ciphertext (we know if it has been altered, which is not assured by traditional algorithms) and consequently its authenticity. Numerical experiments are presented, discussed and show the behavior of the method in terms of security and performance. The fast version of the algorithm has a performance comparable to AES, a popular cryptography program used commercially nowadays, but it is more secure, which makes it immediately suitable for general purpose cryptography applications. An internet page has been set up, which enables the readers to test the algorithm and also to try to break into the cipher.

PARALLEL ALGORITHMS FOR MAXIMAL CLIQUES IN CIRCLE GRAPHS AND UNRESTRICTED DEPTH SEARCH

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present parallel algorithms on the BSP/CGM model, with p processors, to count and generate all the maximal cliques of a circle graph with n vertices and m edges. To count the number of all the maximal cliques, without actually generating them, our algorithm requires O(log p) communication rounds with O(nm/p) local computation time. We also present an algorithm to generate the first maximal clique in O(log p) communication rounds with O(nm/p) local computation, and to generate each one of the subsequent maximal cliques this algorithm requires O(log p) communication rounds with O(m/p) local computation. The maximal cliques generation algorithm is based on generating all maximal paths in a directed acyclic graph, and we present an algorithm for this problem that uses O(log p) communication rounds with O(m/p) local computation for each maximal path. We also show that the presented algorithms can be extended to the CREW PRAM model.

Lie algebra methods for the applications to the statistical theory of turbulence

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Approximate Lie symmetries of the Navier-Stokes equations are used for the applications to scaling phenomenon arising in turbulence. In particular, we show that the Lie symmetries of the Euler equations are inherited by the Navier-Stokes equations in the form of approximate symmetries that allows to involve the Reynolds number dependence into scaling laws. Moreover, the optimal systems of all finite-dimensional Lie subalgebras of the approximate symmetry transformations of the Navier-Stokes are constructed. We show how the scaling groups obtained can be used to introduce the Reynolds number dependence into scaling laws explicitly for stationary parallel turbulent shear flows. This is demonstrated in the framework of a new approach to derive scaling laws based on symmetry analysis [11]-[13].

Alphanumeric LCD infrared control via computer’s parallel port

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The present work will explain a method to achieve a remote controlled (via IR LED) alphanumeric Liquid Crystal Display. In modern times, the remote access of different devices has become quite popular, therefore, the aim of this project is to provide a useful tool that will integrate common and easy to access devices. The system includes a C language based user interface, an assembly language code for the AT89C51ED2 microcontroller instructions and some digital electronic circuits needed for the driving and control of both the LCD and the infrared communication, as well as the PC with a parallel port. The interaction of all the devices provides a whole system that can be helpful in different applications, or it can be separated into each one of its different stages to take the best advantage as possible.

On developing efficient applications using conservative distributed simulation paradigm

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper aims to present, using a set of guidelines, how to apply the conservative distributed simulation paradigm (CMB protocol) to develop efficient applications. Using these guidelines, even a user with little experience on distributed simulation and computer architecture can have good performance on distributed simulations using conservative synchronization protocols for parallel processes.The set of guidelines is focus on a specific application domain, the performance evaluation of computer systems, considering models with coarse granularity and few logical processes and running over two platforms: parallel (high performance communication environment) and distributed (low performance communication environment).

On Clifford subalgebras, spacetime splittings and applications

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Z(2)-gradings of Clifford algebras are reviewed and we shall be concerned with an alpha-grading based on the structure of inner automorphisms, which is closely related to the spacetime splitting, if we consider the standard conjugation map automorphism by an arbitrary, but fixed, splitting vector. After briefly sketching the orthogonal and parallel components of products of differential forms, where we introduce the parallel [orthogonal] part as the space [time] component, we provide a detailed exposition of the Dirac operator splitting and we show how the differential operator parallel and orthogonal components are related to the Lie derivative along the splitting vector and the angular momentum splitting bivector. We also introduce multivectorial-induced alpha-gradings and present the Dirac equation in terms of the spacetime splitting, where the Dirac spinor field is shown to be a direct sum of two quaternions. We point out some possible physical applications of the formalism developed.

Speedup and scalability analysis of Master-Slave applications on large heterogeneous clusters

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Although cluster environments have an enormous potential processing power, real applications that take advantage of this power remain an elusive goal. This is due, in part, to the lack of understanding about the characteristics of the applications best suited for these environments. This paper focuses on Master/Slave applications for large heterogeneous clusters. It defines application, cluster and execution models to derive an analytic expression for the execution time. It defines speedup and derives speedup bounds based on the inherent parallelism of the application and the aggregated computing power of the cluster. The paper derives an analytical expression for efficiency and uses it to define scalability of the algorithm-cluster combination based on the isoefficiency metric. Furthermore, the paper establishes necessary and sufficient conditions for an algorithm-cluster combination to be scalable which are easy to verify and use in practice. Finally, it covers the impact of network contention as the number of processors grow. (C) 2007 Elsevier B.V. All rights reserved.

Single-phase high power factor hybrid rectifier suitable for high-power applications

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

«
1
2
...
5
6
7
8
9
10
11
...
63
64
»