997 resultados para cluster computing


Relevância:

70.00% 70.00%

Publicador:

Relevância:

70.00% 70.00%

Publicador:

Resumo:

High spatial resolution environmental data gives us a better understanding of the environmental factors affecting plant distributions at fine spatial scales. However, large environmental datasets dramatically increase compute times and output species model size stimulating the need for an alternative computing solution. Cluster computing offers such a solution, by allowing both multiple plant species Environmental Niche Models (ENMs) and individual tiles of high spatial resolution models to be computed concurrently on the same compute cluster. We apply our methodology to a case study of 4,209 species of Mediterranean flora (around 17% of species believed present in the biome). We demonstrate a 16 times speed-up of ENM computation time when 16 CPUs were used on the compute cluster. Our custom Java ‘Merge’ and ‘Downsize’ programs reduce ENM output files sizes by 94%. The median 0.98 test AUC score of species ENMs is aided by various species occurrence data filtering techniques. Finally, by calculating the percentage change of individual grid cell values, we map the projected percentages of plant species vulnerable to climate change in the Mediterranean region between 1950–2000 and 2020.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Recent research efforts of parallel processing on non-dedicated clusters have focused on high execution performance, parallelism management, transparent access to resources, and making clusters easy to use. However, as a collection of independent computers used by multiple users, clusters are susceptible to failure. This paper shows the development of a coordinated checkpointing facility for the GENESIS cluster operating system. This facility was developed by exploiting existing operating system services. High performance and low overheads are achieved by allowing the processes of a parallel application to continue executing during the creation of checkpoints, while maintaining low demands on cluster resources by using coordinated checkpointing.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

With the current popularity of cluster computing systems, it is increasingly important to understand the capabilities and potential performance of various interconnection networks. In this paper, we propose an analytical model for studying the capabilities and potential performance of interconnection networks for multi-cluster systems. The model takes into account stochastic quantities as well as network heterogeneity in bandwidth and latency in each cluster. Also, blocking and non-blocking network architecture model is proposed and are used in performance analysis of the system. The model is validated by constructing a set of simulators to simulate different types of clusters, and by comparing the modeled results with the simulated ones.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper addresses the problem of performance modeling for large-scale heterogeneous distributed systems with emphases on multi-cluster computing systems. Since the overall performance of distributed systems is often depends on the effectiveness of its communication network, the study of the interconnection networks for these systems is very important. Performance modeling is required to avoid poorly chosen components and architectures as well as discovering a serious shortfall during system testing just prior to deployment time. However, the multiplicity of components and associated complexity make performance analysis of distributed computing systems a challenging task. To this end, we present an analytical performance model for the interconnection networks of heterogeneous multi-cluster systems. The analysis is based on a parametric family of fat-trees, the m-port n-tree, and a deterministic routing algorithm, which is proposed in this paper. The model is validated through comprehensive simulation, which demonstrated that the proposed model exhibits a good degree of accuracy for various system organizations and under different working conditions.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The study of interconnection networks is important because the overall performance of a distributed system is often critically hinged on the effectiveness of its interconnection network. In the mean time, the heterogeneity is one of the most important factors of such systems. This paper addresses the problem of interconnection networks performance modeling of large-scale distributed systems with emphases on heterogeneous multi-cluster computing systems. So, we present an analytical model to predict message latency in multi-cluster systems in the presence of cluster size heterogeneity. The model is validated through comprehensive simulation, which demonstrates that the proposed model exhibits a good degree of accuracy for various system organizations and under different working conditions.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The advent of commodity-based high-performance clusters has raised parallel and distributed computing to a new level. However, in order to achieve the best possible performance improvements for large-scale computing problems as well as good resource utilization, efficient resource management and scheduling is required. This paper proposes a new two-level adaptive space-sharing scheduling policy for non-dedicated heterogeneous commodity-based high-performance clusters. Using trace-driven simulation, the performance of the proposed scheduling policy is compared with existing adaptive space-sharing policies. Results of the simulation show that the proposed policy performs substantially better than the existing policies.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work describes recent extensions to the GPFlow scientific workflow system in development at MQUTeR (www.mquter.qut.edu.au), which facilitate interactive experimentation, automatic lifting of computations from single-case to collection-oriented computation and automatic correlation and synthesis of collections. A GPFlow workflow presents as an acyclic data flow graph, yet provides powerful iteration and collection formation capabilities.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Precision, sophistication and economic factors in many areas of scientific research that demand very high magnitude of compute power is the order of the day. Thus advance research in the area of high performance computing is getting inevitable. The basic principle of sharing and collaborative work by geographically separated computers is known by several names such as metacomputing, scalable computing, cluster computing, internet computing and this has today metamorphosed into a new term known as grid computing. This paper gives an overview of grid computing and compares various grid architectures. We show the role that patterns can play in architecting complex systems, and provide a very pragmatic reference to a set of well-engineered patterns that the practicing developer can apply to crafting his or her own specific applications. We are not aware of pattern-oriented approach being applied to develop and deploy a grid. There are many grid frameworks that are built or are in the process of being functional. All these grids differ in some functionality or the other, though the basic principle over which the grids are built is the same. Despite this there are no standard requirements listed for building a grid. The grid being a very complex system, it is mandatory to have a standard Software Architecture Specification (SAS). We attempt to develop the same for use by any grid user or developer. Specifically, we analyze the grid using an object oriented approach and presenting the architecture using UML. This paper will propose the usage of patterns at all levels (analysis. design and architectural) of the grid development.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Accepted Version

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper describes an autonomics development tool which serves as both a powerful and flexible policy-expression language and a policy-based framework that supports the integration and dynamic composition of several autonomic computing techniques including signal processing, automated trend analysis and utility functions. Each of these technologies has specific advantages and applicability to different types of dynamic adaptation. The AGILE platform enables seamless interoperability of the different technologies to each perform various aspects of self-management within a single application. Self-management behaviour is specified using the policy language semantics to bind the various technologies together as required. Since the policy semantics support run-time re-configuration, the self-management architecture is dynamically composable. The policy language and implementation library have integrated support for self-stabilising behaviour, enabling oscillation and other forms of instability to be handled at the policy level with very little effort on the part of the application developer. Example applications are presented to illustrate the integration of different autonomics techniques, and the achievement of dynamic composition.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

MPJ Express is a thread-safe Java messaging library that provides a full implementation of the mpiJava 1.2 API specification. This specification defines a MPI-like bindings for the Java language. We have implemented two communication devices as part of our library, the first, called niodev is based on the Java New I/O package and the second, called mxdev is based on the Myrinet eXpress library MPJ Express comes with an experimental runtitne, which allows portable bootstrapping of Java Virtual Machines across a cluster or network of computers. In this paper we describe the implementation of MPJ Express. Also, we present a performance comparison against various other C and Java messaging systems. A beta version of MPJ Express was released in September 2005.

Relevância:

60.00% 60.00%

Publicador:

Relevância:

60.00% 60.00%

Publicador:

Resumo:

With the increasing awareness of protein folding disorders, the explosion of genomic information, and the need for efficient ways to predict protein structure, protein folding and unfolding has become a central issue in molecular sciences research. Molecular dynamics computer simulations are increasingly employed to understand the folding and unfolding of proteins. Running protein unfolding simulations is computationally expensive and finding ways to enhance performance is a grid issue on its own. However, more and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. This paper describes efforts to provide a grid-enabled data warehouse for protein unfolding data. We outline the challenge and present first results in the design and implementation of the data warehouse.