26 results for data replication

in Deakin Research Online - Australia


Relevance: 100.00%

Abstract:

Mobile computing has enabled users to seamlessly access databases even when they are on the move. Mobile computing environments require data management approaches that can provide complete and highly available access to shared data at any time from anywhere. In this paper, we propose a novel replicated data protocol for achieving this goal. The proposed scheme replicates data synchronously over stationary sites based on a three-dimensional grid structure, while objects in mobile sites are asynchronously replicated based on commonly visited sites for each user. This combination allows the proposed protocol to operate with less than full connectivity, to adapt easily to changes in group membership, and to avoid requiring all sites to agree to update data objects at any given time, thus giving the technique flexibility in mobile environments. The proposed replication technique is compared with a baseline replication technique and shown to exhibit high availability, fault tolerance and minimal access times of the data and services, which are very important in an environment with low-quality communication links.

Relevance: 100.00%

Abstract:

Data replication is one of the key components in data grid architecture as it enhances data access and reliability and minimises the cost of data transmission. In this paper, we address the problem of reducing the overheads of the replication mechanisms that drive the data management components of a data grid. We propose an approach that extends the resource broker with policies that factor in user quality of service as well as service costs when replicating and transferring data. A realistic model of the data grid was created to simulate and explore the performance of the proposed policy. The policy proved an effective means of improving the performance of grid network traffic, as indicated by the improved speed and cost of transfers by brokers.

Relevance: 100.00%

Abstract:

Failures are normal rather than exceptional in cloud computing environments. To improve system availability, replicating popular data to multiple suitable locations is an advisable choice, as users can access the data from a nearby site. This is, however, not the case for replicas which must have a fixed number of copies on several locations. How to decide a reasonable number of replicas and the right locations for them has become a challenge in cloud computing. In this paper, a dynamic data replication strategy is put forward with a brief survey of replication strategies suitable for distributed computing environments. It includes: 1) analyzing and modeling the relationship between system availability and the number of replicas; 2) evaluating and identifying the popular data and triggering a replication operation when data popularity passes a dynamic threshold; 3) calculating a suitable number of copies to meet a reasonable system byte effective rate requirement and placing replicas among data nodes in a balanced way; 4) designing the dynamic data replication algorithm in a cloud. Experimental results demonstrate the efficiency and effectiveness of the improved system brought by the proposed strategy in a cloud.
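
The relationship between availability and replica count named in step 1 can be illustrated with a minimal sketch. It assumes every replica is independently available with the same probability p, which the abstract does not state; the function names are hypothetical, and this is not the paper's model.

```python
def system_availability(p: float, n: int) -> float:
    """Probability that at least one of n replicas is reachable.

    Assumption (not from the abstract): replicas fail independently,
    each being available with probability p.
    """
    return 1.0 - (1.0 - p) ** n


def min_replicas(p: float, target: float) -> int:
    """Smallest replica count whose availability reaches the target."""
    if not (0.0 < p < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    n = 1
    while system_availability(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Replicas needed for a 99.5% target when each node is up 90% of the time.
    print(min_replicas(0.9, 0.995))
```

Under this assumption each additional replica adds less availability than the previous one, which is one reason a strategy might cap the number of copies rather than replicate everywhere.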

Relevance: 100.00%

Abstract:

Data grids have been adopted by many scientific communities that need to share, access, transport, process, and manage geographically distributed large data collections. Data replication is one of the main mechanisms used in data grids whereby identical copies of data are generated and stored at various distributed sites to either improve data access performance or reliability or both. However, when data updates are allowed, it is a great challenge to simultaneously improve performance and reliability while ensuring data consistency of such huge and widely distributed data. In this paper, we address this problem. We propose a new quorum-based data replication protocol with the objectives of minimizing the data update cost, providing high availability and data consistency. We compare the proposed approach with two existing approaches using response time, data consistency, data availability, and communication costs. The results show that the proposed approach performs substantially better than the benchmark approaches.
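
For readers unfamiliar with quorum-based replication, the sketch below checks the classic quorum intersection conditions for n replicas with one vote each. It illustrates the general technique only; the abstract does not describe how the proposed protocol forms its quorums, and the function names are hypothetical.

```python
def is_valid_quorum(n: int, r: int, w: int) -> bool:
    """Check the classic quorum conditions for n equally weighted replicas.

    r + w > n  : every read quorum overlaps every write quorum,
                 so a read always sees the latest write.
    2 * w > n  : any two write quorums overlap, so two conflicting
                 writes cannot both succeed.
    """
    in_range = 1 <= r <= n and 1 <= w <= n
    return in_range and r + w > n and 2 * w > n


def valid_quorums(n: int) -> list[tuple[int, int]]:
    """List every (read, write) quorum size pair that is valid for n replicas."""
    return [
        (r, w)
        for r in range(1, n + 1)
        for w in range(1, n + 1)
        if is_valid_quorum(n, r, w)
    ]


if __name__ == "__main__":
    print(valid_quorums(3))
```

Smaller write quorums lower the update cost but force larger read quorums, which is the trade-off a quorum protocol has to balance.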

Relevance: 80.00%

Abstract:

Data grid research deals with the storage, replication, and management of large data sets in a distributed environment. All-data-to-all-sites replication schemes, such as Read-One Write-All (ROWA) and Tree Grid Structure (TGS), are popular techniques in grids. However, these techniques have weaknesses in data storage capacity and data access times. In this paper, we propose an all-data-to-some-sites scheme called the 'Neighbour Replication on Triangular Grid' (NRTG) technique. The proposed scheme minimises the storage capacity required as well as data access time, while providing high update availability. It also tolerates failures such as server and site failures.
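
The update weakness of Read-One Write-All can be seen in a small sketch. It assumes n sites that are each independently available with probability p, an assumption not made in the abstract; the function names are hypothetical, and NRTG itself is not modelled here.

```python
def rowa_read_availability(p: float, n: int) -> float:
    """A ROWA read succeeds if any one of the n copies is reachable."""
    return 1.0 - (1.0 - p) ** n


def rowa_write_availability(p: float, n: int) -> float:
    """A ROWA write succeeds only if all n copies are reachable."""
    return p ** n


if __name__ == "__main__":
    # Adding sites helps reads but hurts writes under ROWA.
    for n in (1, 3, 5):
        print(n, rowa_read_availability(0.5, n), rowa_write_availability(0.5, n))
```

Because every site holds a full copy and every write must reach all of them, storage use and update availability both worsen as sites are added, which is the motivation the abstract gives for replicating to only some sites.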

Relevance: 80.00%

Abstract:

Online social networks make it easier for people to find and communicate with other people based on shared interests, values, membership in particular groups, etc. Common social networks such as Facebook and Twitter have hundreds of millions or even billions of users scattered all around the world sharing interconnected data. Users demand low latency access to not only their own data but also their friends’ data, often very large, e.g. videos, pictures etc. However, social network service providers have a limited monetary capital to store every piece of data everywhere to minimise users’ data access latency. Geo-distributed cloud services with virtually unlimited capabilities are suitable for large scale social networks data storage in different geographical locations. Key problems including how to optimally store and replicate these huge datasets and how to distribute the requests to different datacenters are addressed in this paper. A novel genetic algorithm-based approach is used to find a near-optimal number of replicas for every user’s data and a near-optimal placement of replicas to minimise monetary cost while satisfying latency requirements for all users. Experiments on a large Facebook dataset demonstrate our technique’s effectiveness in outperforming other representative placement and replication strategies.

Relevance: 70.00%

Abstract:

The availability of critical services and their data can be significantly increased by replicating them on multiple systems connected with each other, even in the face of system and network failures. In some platforms such as peer-to-peer (P2P) systems, their inherent characteristics mandate the employment of some form of replication to provide acceptable service to their users. However, how best to replicate data to build highly available peer-to-peer systems is still an open problem. In this paper, we propose an approach to address the data replication problem on P2P systems. The proposed scheme is compared with other techniques and is shown to require less communication cost for an operation as well as provide a higher degree of data availability.

Relevance: 70.00%

Abstract:

The widespread adoption of cluster computing as a high performance computing platform has seen the growth of data intensive scientific, engineering and commercial applications such as digital libraries, climate modeling, computational chemistry, computational fluid dynamics and image repositories. However, I/O subsystem performance has not been keeping pace with processor and memory performance, and is fast becoming the dominant factor in overall system performance.  Thus, parallel I/O has become a necessity in the face of performance improvements in other areas of computing systems. This paper addresses the problem of parallel I/O scheduling on cluster computing systems in the presence of data replication.  We propose two new I/O scheduling algorithms and evaluate the relative performance of the proposed policies against two existing approaches.  Simulation results show that the proposed policies perform substantially better than the baseline policies.

Relevance: 70.00%

Abstract:

In data-intensive distributed systems, replication is the most widely used approach to offer high data availability, low bandwidth consumption, increased fault-tolerance and improved scalability of the overall system. Replication-based systems implement replica control protocols that enforce specified semantics for accessing the data. Their performance depends on a number of factors, chief of which is the protocol used to maintain consistency among object replicas. In this paper, we propose a new low-cost and high data availability protocol called the box-shaped grid structure for maintaining consistency of replicated data on networked distributed computing systems. We show that the proposed protocol provides high data availability, low communication costs, and increased fault-tolerance as compared to the baseline replica control protocols.

Relevance: 70.00%

Abstract:

Providing reliable and efficient services is a primary goal in designing a web server system. Data replication can be used to improve the reliability of the system. However, the mapping mechanism is one of the primary concerns in data replication. In this paper, we propose a mapping mechanism model called enhanced domain name server (E-DNS) that dispatches user requests by mapping URL names to IP addresses under the Neighbor Replica Distribution Technique (NRDT) to improve the reliability of the system.

Relevance: 60.00%

Abstract:

Mobile computing has enabled users to seamlessly access databases even when they are on the move. However, in the absence of readily available high-quality communication, users are often forced to operate disconnected from the network. As a result, software applications have to be redesigned to take advantage of this environment while accommodating the new challenges posed by mobility. In particular, there is a need for replication and synchronization services in order to guarantee availability of data and functionality (including updates) in disconnected mode. To this end, we propose a scalable and highly available data replication and management service. The proposed replication technique is compared with a baseline replication technique and shown to exhibit high availability, fault tolerance and minimal access times of the data and services, which are very important in an environment with low-quality communication links.

Relevance: 60.00%

Abstract:

Recent advances in hardware technologies such as portable computers and wireless communication networks have led to the emergence of mobile computing systems. Thus, availability and accessibility of the data and services become important issues of mobile computing systems. In this paper, we present a data replication and management scheme tailored for such environments. In the proposed scheme, data is replicated synchronously over stationary sites, while for the mobile network, data is replicated asynchronously based on commonly visited sites for each user. The proposed scheme is compared with other techniques and is shown to require less communication cost for an operation as well as provide a higher degree of data availability.

Relevance: 60.00%

Abstract:

Citizen science involves collaboration between multi-sector agencies and the public to address a natural resource management issue. The Sea Search citizen science programme involves community groups in monitoring and collecting subtidal rocky reef and intertidal rocky shore data in Victorian Marine Protected Areas (MPAs), Australia. In this study we compared volunteer-collected and scientist-collected data and examined volunteer motivation for participation in the Sea Search programme. Intertidal rocky shore volunteer-collected data was found to be typically comparable to data collected by scientists for species richness and diversity measures. For subtidal monitoring there was also no significant difference in species richness recorded by scientists and volunteers. However, low statistical power suggests that only large changes could be detected, due to reduced data replication. Generally, volunteers recorded lower species diversity for biological groups compared to scientists, although the difference was not significant. Species abundance measures for algae species were significantly different between volunteers and scientists. These results suggest that volunteers have difficulty with identification and abundance measurements, and that additional training is needed for surveying algae assemblages. The subtidal monitoring results also highlight the difficulties of collecting data in exposed rocky reef habitats, with weather conditions and volunteer diver availability constraining sampling effort. The prime motivation for volunteer participation in Sea Search was to assist with scientific research, followed closely by wanting to work close to nature. This study revealed two important themes for volunteer engagement in Sea Search: 1) volunteer training and participation and 2) usability of volunteer-collected data for MPA managers.
Volunteer-collected data through the Sea Search citizen science programme has the potential to provide useable data to assist in informed management practices of Victoria’s MPAs, but requires the support and commitment from all partners involved.

Relevance: 40.00%

Abstract:

Data is one of the domains in grid research that deals with the storage, replication, and management of large data sets in a distributed environment. All-data-to-all-sites replication schemes such as read-one write-all and tree grid structure (TGS) are the popular techniques used for replication and management of data in this domain. However, these techniques have weaknesses in terms of data storage capacity and data access times, because some number of sites must ‘agree’ in common to execute certain transactions. In this paper, we propose an all-data-to-some-sites scheme called the neighbor replication on triangular grid (NRTG) technique, in which only neighbors hold the replicated data, thus minimizing the storage capacity required while providing high update availability. The technique also tolerates failures such as server failures, site failures or even network partitioning, using remote procedure call (RPC).

Relevance: 30.00%

Abstract:

This paper analyses update ordering and its impact on the performance of a distributed replication system. We propose a model for update orderings and constraints and develop a number of algorithms for implementing different ordering constraints. A performance study is then carried out to analyse the update-ordering model. We show that our model allows an ordering constraint to be defined on each update operation, and that the ordering implementation takes account of detailed inter-operation semantics, expressed as commutative and causal operations, to reduce unnecessary delay, resulting in a better response time for update requests.