894 results for HPC parallel computer architecture queues fault tolerance programmability ADAM


Relevance:

30.00% 30.00%

Publisher:

Abstract:

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (EpidemicK-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
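
As a rough illustration of how a global aggregate can be approximated without any global communication, the sketch below uses plain pairwise gossip averaging. It is not the EpidemicK-Means protocol from the abstract, which is not specified here; the function name `gossip_average`, the sample values and the fixed seed are all assumptions made for the example. In a K-Means setting, values of this kind might be per-cluster sums and counts held locally by each node.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise gossip: in each step two random nodes replace their
    values with the mean of the pair. The global sum is preserved, so
    every node drifts towards the global mean without a coordinator."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    state = list(values)
    for _ in range(rounds):
        i, j = rng.sample(range(len(state)), 2)  # two distinct nodes
        mean = (state[i] + state[j]) / 2.0
        state[i] = state[j] = mean
    return state


# Hypothetical local values held by six nodes; their true mean is 6.0.
local_values = [1.0, 5.0, 9.0, 13.0, 2.0, 6.0]
estimates = gossip_average(local_values, rounds=2000)
print(estimates)
```

Because each exchange involves only two nodes, a lost exchange delays convergence rather than stopping the computation, which is the property the abstract relies on.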

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In a world where massive amounts of data are recorded on a large scale we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However, alternative technologies have been derived that often produce better rules but do not scale well on large datasets. Such an alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale-up results of a first prototype implementation.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In a world where data is captured on a large scale the major challenge for data mining algorithms is to be able to scale up to large datasets. There are two main approaches to inducing classification rules: one is the divide and conquer approach, also known as the top down induction of decision trees; the other is the separate and conquer approach. A considerable amount of work has been done on scaling up the divide and conquer approach. However, very little work has been conducted on scaling up the separate and conquer approach. In this work we describe a parallel framework that allows the parallelisation of a certain family of separate and conquer algorithms, the Prism family. Parallelisation helps the Prism family of algorithms to harvest additional computer resources in a network of computers in order to make the induction of classification rules scale better on large datasets. Our framework also incorporates a pre-pruning facility for parallel Prism algorithms.
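
To make the separate and conquer idea concrete, here is a minimal serial sketch in the spirit of the basic Prism algorithm: for each class, a rule is specialised one attribute-value term at a time until it covers only that class, and the covered instances are then removed before the next rule is induced. It is a simplified illustration, not the parallel framework or the pre-pruning facility described in the abstract. The function name `prism`, the tie-breaking rule and the toy dataset are assumptions.

```python
def prism(rows, target):
    """Return rules as (conditions, class_label) pairs induced by a
    basic separate-and-conquer loop over a list of dict records."""
    rules = []
    for cls in sorted({r[target] for r in rows}):
        remaining = list(rows)
        while any(r[target] == cls for r in remaining):
            covered, conditions = remaining, {}
            # Conquer: add terms until the rule covers only `cls`.
            while any(r[target] != cls for r in covered):
                best = None
                for attr in covered[0]:
                    if attr == target or attr in conditions:
                        continue
                    for value in sorted({r[attr] for r in covered}):
                        subset = [r for r in covered if r[attr] == value]
                        pos = sum(r[target] == cls for r in subset)
                        # Prefer high precision, then high coverage.
                        score = (pos / len(subset), pos)
                        if pos and (best is None or score > best[0]):
                            best = (score, attr, value, subset)
                if best is None:  # no attributes left (clashing data)
                    break
                _, attr, value, covered = best
                conditions[attr] = value
            rules.append((conditions, cls))
            # Separate: drop the instances the new rule covers.
            covered_ids = {id(r) for r in covered}
            remaining = [r for r in remaining if id(r) not in covered_ids]
    return rules


# Hypothetical toy dataset, used only for illustration.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
rules = prism(rows, "play")
for conditions, label in rules:
    print(conditions, "->", label)
```

The inner search over candidate terms dominates the running time on large datasets, which is one plausible reason a framework of the kind described would distribute that work across machines.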

Relevance:

30.00% 30.00%

Publisher:

Abstract:

How can organizations use digital infrastructure to realise physical outcomes? The design and construction of London Heathrow Terminal 5 is analysed to build new theoretical understanding of visualization and materialization practices in the transition from digital design to physical realisation. In the project studied, an integrated software solution is introduced as an infrastructure for delivery. The analyses articulate the work done to maintain this digital infrastructure and also to move designs beyond the closed world of the computer to a physical reality. In changing medium, engineers use heterogeneous trials to interrogate and address the limitations of an integrated digital model. The paper explains why such trials, which involve the reconciliation of digital and physical data through parallel and iterative forms of work, provide a robust practice for realizing goals that have physical outcomes. It argues that this practice is temporally different from, and at times in conflict with, building a comprehensive dataset within the digital medium. The paper concludes by discussing the implications for organizations that use digital infrastructures in seeking to accomplish goals in digital and physical media.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Java is becoming an increasingly popular language for developing distributed and parallel scientific and engineering applications. Jini is a Java-based infrastructure developed by Sun that can allegedly provide all the services necessary to support distributed applications. It is the aim of this paper to explore and investigate the services and properties that Jini actually provides and match these against the needs of high performance distributed and parallel applications written in Java. The motivation for this work is the need to develop a distributed infrastructure to support an MPI-like interface to Java known as MPJ. In the first part of the paper we discuss the needs of MPJ, the parallel environment that we wish to support. In particular we look at aspects such as reliability and ease of use. We then move on to sketch out the Jini architecture and review the components and services that Jini provides. In the third part of the paper we critically explore a Jini infrastructure that could be used to support MPJ. Here we are particularly concerned with Jini's ability to reliably support a cocoon of MPJ processes executing in a heterogeneous environment. In the final part of the paper we summarise our findings and report on future work being undertaken on Jini and MPJ.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A parallel pipelined array of cells suitable for real-time computation of histograms is proposed. The cell architecture builds on previous work obtained via C-slow retiming techniques and can be clocked at a frequency 65 percent higher than previous arrays. The new arrays can be exploited for higher throughput, particularly when dual data rate sampling techniques are used to operate on single streams of data from image sensors. In this way, the new cell operates on a p-bit data bus, which is more convenient for interfacing to camera sensors or to microprocessors in consumer digital cameras.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The complexity of current and emerging high performance architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven performance modelling approach is outlined that is appropriate for modern multicore architectures. The approach is demonstrated by constructing a model of a simple shallow water code on a Cray XE6 system, from application-specific benchmarks that illustrate precisely how architectural characteristics impact performance. The model is found to recreate observed scaling behaviour up to 16K cores, and used to predict optimal rank-core affinity strategies, exemplifying the type of problem such a model can be used for.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The use of virtualization in high-performance computing (HPC) has been suggested as a means to provide tailored services and added functionality that many users expect from full-featured Linux cluster environments. The use of virtual machines in HPC can offer several benefits, but maintaining performance is a crucial factor. In some instances the performance criteria are placed above the isolation properties. This selective relaxation of isolation for performance is an important characteristic when considering resilience for HPC environments that employ virtualization. In this paper we consider some of the factors associated with balancing performance and isolation in configurations that employ virtual machines. In this context, we propose a classification of errors based on the concept of “error zones”, as well as a detailed analysis of the trade-offs between resilience and performance based on the level of isolation provided by virtualization solutions. Finally, a set of experiments is performed using different virtualization solutions to elucidate the discussion.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The three-dimensional molecular dynamics simulation method has been used to study the dynamic responses of an electrorheological (ER) fluid in oscillatory shear. The structure and related viscoelastic behaviour of the fluid are found to be sensitive to the amplitude of the strain. With the increase of the strain amplitude, the structure formed by the particles changes from isolated columns to sheet-like structures which may be perpendicular or parallel to the oscillating direction. Along with the structure evolution, the field-induced moduli decrease significantly with an increase in strain amplitude. The viscoelastic behaviour of the structures obtained in the cases of different strain amplitudes was examined in the linear response regime and an evident structure dependence of the moduli was found. The reason for this lies in the anisotropy of the arrangement of the particles in these structures. Short-range interactions between the particles cannot be neglected in determining the viscoelastic behaviour of ER fluids at small strain amplitude, especially for parallel sheets. The simulation results were compared with available experimental data and good agreement was reached for most of them.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper is about the use of natural language to communicate with computers. Most research pursuing this goal considers only requests expressed in English. A way to facilitate the use of several languages in natural language systems is by using an interlingua. An interlingua is an intermediary representation for natural language information that can be processed by machines. We propose to convert natural language requests into an interlingua [universal networking language (UNL)] and to execute these requests using software components. In order to achieve this goal, we propose OntoMap, an ontology-based architecture to perform the semantic mapping between UNL sentences and software components. OntoMap also performs component search and retrieval based on semantic information formalized in ontologies and rules.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Reusable and evolvable Software Engineering Environments (SEEs) are essential to software production and have increasingly become a need. From another perspective, software architectures and reference architectures have played a significant role in determining the success of software systems. In this paper we present a reference architecture for SEEs, named RefASSET, which is based on concepts coming from the aspect-oriented approach. This architecture is specialized to the software testing domain and the development of tools for that domain is discussed. This and other case studies have pointed out that the use of aspects in RefASSET provides a better Separation of Concerns, resulting in reusable and evolvable SEEs. (C) 2011 Elsevier Inc. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In testing from a Finite State Machine (FSM), the generation of test suites which guarantee full fault detection, known as complete test suites, has been a long-standing research topic. In this paper, we present conditions that are sufficient for a test suite to be complete. We demonstrate that the existing conditions are special cases of the proposed ones. An algorithm that checks whether a given test suite is complete is given. The experimental results show that the algorithm can be used for relatively large FSMs and test suites.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A novel cryptography method based on the Lorenz's attractor chaotic system is presented. The proposed algorithm is secure and fast, making it practical for general use. We introduce the chaotic operation mode, which provides an interaction among the password, message and a chaotic system. It ensures that the algorithm yields a secure codification, even if the nature of the chaotic system is known. The algorithm has been implemented in two versions: one sequential and slow and the other parallel and fast. Our algorithm assures the integrity of the ciphertext (we know if it has been altered, which is not assured by traditional algorithms) and consequently its authenticity. Numerical experiments are presented and discussed, and they show the behavior of the method in terms of security and performance. The fast version of the algorithm has a performance comparable to AES, a popular cryptography program used commercially nowadays, but it is more secure, which makes it immediately suitable for general purpose cryptography applications. An internet page has been set up, which enables the readers to test the algorithm and also to try to break into the cipher.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present parallel algorithms on the BSP/CGM model, with p processors, to count and generate all the maximal cliques of a circle graph with n vertices and m edges. To count the number of all the maximal cliques, without actually generating them, our algorithm requires O(log p) communication rounds with O(nm/p) local computation time. We also present an algorithm to generate the first maximal clique in O(log p) communication rounds with O(nm/p) local computation, and to generate each one of the subsequent maximal cliques this algorithm requires O(log p) communication rounds with O(m/p) local computation. The maximal cliques generation algorithm is based on generating all maximal paths in a directed acyclic graph, and we present an algorithm for this problem that uses O(log p) communication rounds with O(m/p) local computation for each maximal path. We also show that the presented algorithms can be extended to the CREW PRAM model.
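
The abstract builds clique generation on enumerating all maximal paths in a directed acyclic graph. The sketch below shows only that sub-problem, sequentially, taking a maximal path to be one that runs from a vertex with no incoming edge to a vertex with no outgoing edge. It is not the BSP/CGM algorithm and makes no attempt to match its communication or computation bounds; the function name `maximal_paths` and the example graph are assumptions.

```python
def maximal_paths(edges):
    """Yield every maximal path of a DAG given as (u, v) edge pairs,
    i.e. every path from a source vertex to a sink vertex."""
    succ, has_pred = {}, set()
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        succ.setdefault(v, [])
        has_pred.add(v)
    sources = [u for u in succ if u not in has_pred]

    def extend(path):
        last = path[-1]
        if not succ[last]:  # sink reached: the path cannot grow further
            yield list(path)
            return
        for nxt in succ[last]:
            path.append(nxt)
            yield from extend(path)
            path.pop()  # backtrack before trying the next successor

    for s in sources:
        yield from extend([s])


# Hypothetical example DAG with one source ("a") and one sink ("e").
dag_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
paths = list(maximal_paths(dag_edges))
print(paths)
```

The number of maximal paths can grow exponentially with the size of the graph, which is consistent with the abstract stating its cost per generated path rather than in total.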