146 resultados para Client-server distributed databases

em CentAUR: Central Archive University of Reading - UK


Relevância:

100.00% 100.00%

Publicador:

Resumo:

There are three key driving forces behind the development of Internet Content Management Systems (CMS) - a desire to manage the explosion of content, a desire to provide structure and meaning to content in order to make it accessible, and a desire to work collaboratively to manipulate content in some meaningful way. Yet the traditional CMS has been unable to meet the latter of these requirements, often failing to provide sufficient tools for collaboration in a distributed context. Peer-to-Peer (P2P) systems are networks in which every node is an equal participant (whether transmitting data, exchanging content, or invoking services) and there is an absence of any centralised administrative or coordinating authorities. P2P systems are inherently more scalable than equivalent client-server implementations as they tend to use resources at the edge of the network much more effectively. This paper details the rationale and design of a P2P middleware for collaborative content management.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Resource monitoring in distributed systems is required to understand the 'health' of the overall system and to help identify particular problems, such as dysfunctional hardware, a faulty, system or application software. Desirable characteristics for monitoring systems are the ability to connect to any number of different types of monitoring agents and to provide different views of the system, based on a client's particular preferences. This paper outlines and discusses the ongoing activities within the GridRM wide-area resource-monitoring project.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Tycho was conceived in 2003 in response to a need by the GridRM [1] resource-monitoring project for a ldquolight-weightrdquo, scalable and easy to use wide-area distributed registry and messaging system. Since Tycho's first release in 2006 a number of modifications have been made to the system to make it easier to use and more flexible. Since its inception, Tycho has been utilised across a number of application domains including widearea resource monitoring, distributed queries across archival databases, providing services for the nodes of a Cray supercomputer, and as a system for transferring multi-terabyte scientific datasets across the Internet. This paper provides an overview of the initial Tycho system, describes a number of applications that utilise Tycho, discusses a number of new utilities, and how the Tycho infrastructure has evolved in response to experience of building applications with it.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

G-Rex is light-weight Java middleware that allows scientific applications deployed on remote computer systems to be launched and controlled as if they are running on the user's own computer. G-Rex is particularly suited to ocean and climate modelling applications because output from the model is transferred back to the user while the run is in progress, which prevents the accumulation of large amounts of data on the remote cluster. The G-Rex server is a RESTful Web application that runs inside a servlet container on the remote system, and the client component is a Java command line program that can easily be incorporated into existing scientific work-flow scripts. The NEMO and POLCOMS ocean models have been deployed as G-Rex services in the NERC Cluster Grid, and G-Rex is the core grid middleware in the GCEP and GCOMS e-science projects.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Compute grids are used widely in many areas of environmental science, but there has been limited uptake of grid computing by the climate modelling community, partly because the characteristics of many climate models make them difficult to use with popular grid middleware systems. In particular, climate models usually produce large volumes of output data, and running them usually involves complicated workflows implemented as shell scripts. For example, NEMO (Smith et al. 2008) is a state-of-the-art ocean model that is used currently for operational ocean forecasting in France, and will soon be used in the UK for both ocean forecasting and climate modelling. On a typical modern cluster, a particular one year global ocean simulation at 1-degree resolution takes about three hours when running on 40 processors, and produces roughly 20 GB of output as 50000 separate files. 50-year simulations are common, during which the model is resubmitted as a new job after each year. Running NEMO relies on a set of complicated shell scripts and command utilities for data pre-processing and post-processing prior to job resubmission. Grid Remote Execution (G-Rex) is a pure Java grid middleware system that allows scientific applications to be deployed as Web services on remote computer systems, and then launched and controlled as if they are running on the user's own computer. Although G-Rex is general purpose middleware it has two key features that make it particularly suitable for remote execution of climate models: (1) Output from the model is transferred back to the user while the run is in progress to prevent it from accumulating on the remote system and to allow the user to monitor the model; (2) The client component is a command-line program that can easily be incorporated into existing model work-flow scripts. G-Rex has a REST (Fielding, 2000) architectural style, which allows client programs to be very simple and lightweight and allows users to interact with model runs using only a basic HTTP client (such as a Web browser or the curl utility) if they wish. This design also allows for new client interfaces to be developed in other programming languages with relatively little effort. The G-Rex server is a standard Web application that runs inside a servlet container such as Apache Tomcat and is therefore easy to install and maintain by system administrators. G-Rex is employed as the middleware for the NERC1 Cluster Grid, a small grid of HPC2 clusters belonging to collaborating NERC research institutes. Currently the NEMO (Smith et al. 2008) and POLCOMS (Holt et al, 2008) ocean models are installed, and there are plans to install the Hadley Centre’s HadCM3 model for use in the decadal climate prediction project GCEP (Haines et al., 2008). The science projects involving NEMO on the Grid have a particular focus on data assimilation (Smith et al. 2008), a technique that involves constraining model simulations with observations. The POLCOMS model will play an important part in the GCOMS project (Holt et al, 2008), which aims to simulate the world’s coastal oceans. A typical use of G-Rex by a scientist to run a climate model on the NERC Cluster Grid proceeds as follows :(1) The scientist prepares input files on his or her local machine. (2) Using information provided by the Grid’s Ganglia3 monitoring system, the scientist selects an appropriate compute resource. (3) The scientist runs the relevant workflow script on his or her local machine. This is unmodified except that calls to run the model (e.g. with “mpirun”) are simply replaced with calls to "GRexRun" (4) The G-Rex middleware automatically handles the uploading of input files to the remote resource, and the downloading of output files back to the user, including their deletion from the remote system, during the run. (5) The scientist monitors the output files, using familiar analysis and visualization tools on his or her own local machine. G-Rex is well suited to climate modelling because it addresses many of the middleware usability issues that have led to limited uptake of grid computing by climate scientists. It is a lightweight, low-impact and easy-to-install solution that is currently designed for use in relatively small grids such as the NERC Cluster Grid. A current topic of research is the use of G-Rex as an easy-to-use front-end to larger-scale Grid resources such as the UK National Grid service.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Soil data and reliable soil maps are imperative for environmental management. conservation and policy. Data from historical point surveys, e.g. experiment site data and farmers fields can serve this purpose. However, legacy soil information is not necessarily collected for spatial analysis and mapping such that the data may not have immediately useful geo-references. Methods are required to utilise these historical soil databases so that we can produce quantitative maps of soil propel-ties to assess spatial and temporal trends but also to assess where future sampling is required. This paper discusses two such databases: the Representative Soil Sampling Scheme which has monitored the agricultural soil in England and Wales from 1969 to 2003 (between 400 and 900 bulked soil samples were taken annually from different agricultural fields); and the former State Chemistry Laboratory, Victoria, Australia where between 1973 and 1994 approximately 80,000 soil samples were submitted for analysis by farmers. Previous statistical analyses have been performed using administrative regions (with sharp boundaries) for both databases, which are largely unrelated to natural features. For a more detailed spatial analysis that call be linked to climate and terrain attributes, gradual variation of these soil properties should be described. Geostatistical techniques such as ordinary kriging are suited to this. This paper describes the format of the databases and initial approaches as to how they can be used for digital soil mapping. For this paper we have selected soil pH to illustrate the analyses for both databases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiverinitiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design allows a separation of the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability performances.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Facilitating the visual exploration of scientific data has received increasing attention in the past decade or so. Especially in life science related application areas the amount of available data has grown at a breath taking pace. In this paper we describe an approach that allows for visual inspection of large collections of molecular compounds. In contrast to classical visualizations of such spaces we incorporate a specific focus of analysis, for example the outcome of a biological experiment such as high throughout screening results. The presented method uses this experimental data to select molecular fragments of the underlying molecules that have interesting properties and uses the resulting space to generate a two dimensional map based on a singular value decomposition algorithm and a self organizing map. Experiments on real datasets show that the resulting visual landscape groups molecules of similar chemical properties in densely connected regions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recently, two approaches have been introduced that distribute the molecular fragment mining problem. The first approach applies a master/worker topology, the second approach, a completely distributed peer-to-peer system, solves the scalability problem due to the bottleneck at the master node. However, in many real world scenarios the participating computing nodes cannot communicate directly due to administrative policies such as security restrictions. Thus, potential computing power is not accessible to accelerate the mining run. To solve this shortcoming, this work introduces a hierarchical topology of computing resources, which distributes the management over several levels and adapts to the natural structure of those multi-domain architectures. The most important aspect is the load balancing scheme, which has been designed and optimized for the hierarchical structure. The approach allows dynamic aggregation of heterogenous computing resources and is applied to wide area network scenarios.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper focuses on improving computer network management by the adoption of artificial intelligence techniques. A logical inference system has being devised to enable automated isolation, diagnosis, and even repair of network problems, thus enhancing the reliability, performance, and security of networks. We propose a distributed multi-agent architecture for network management, where a logical reasoner acts as an external managing entity capable of directing, coordinating, and stimulating actions in an active management architecture. The active networks technology represents the lower level layer which makes possible the deployment of code which implement teleo-reactive agents, distributed across the whole network. We adopt the Situation Calculus to define a network model and the Reactive Golog language to implement the logical reasoner. An active network management architecture is used by the reasoner to inject and execute operational tasks in the network. The integrated system collects the advantages coming from logical reasoning and network programmability, and provides a powerful system capable of performing high-level management tasks in order to deal with network fault.