Estimating data divergence in cloud computing storage systems


Author(s): Gonçalves, André Miguel Augusto
Contributor(s)

Preguiça, Nuno

Rodrigues, Rodrigo

Date(s)

11/12/2013

2013

Abstract

Dissertation submitted to obtain the degree of Master in Informatics Engineering (Engenharia Informática)

Many internet services are provided through cloud computing infrastructures composed of multiple data centers. To provide high availability and low latency, data is replicated on machines in different data centers, which introduces the complexity of guaranteeing that clients view data consistently. Data stores often opt for a relaxed approach to replication, guaranteeing only eventual consistency, since this improves the latency of operations. However, this may lead to replicas holding different values for the same data. One way to control the divergence of data in eventually consistent systems is to use metrics that measure how stale the data at a replica is. In the past, several algorithms have been proposed to estimate the value of these metrics deterministically. An alternative is to rely on probabilistic metrics that estimate divergence with a certain degree of certainty. This relaxes the need to contact all replicas while still providing a relatively accurate measurement. In this work we designed and implemented a solution to estimate the divergence of data in eventually consistent data stores that scale to many replicas by allowing client-side caching. Measuring divergence when there is a large number of clients calls for new algorithms that provide probabilistic guarantees. Additionally, unlike previous work, we focus on measuring the divergence relative to a state that can lead to the violation of application invariants.

Partially funded by project PTDC/EIA-EIA/108963/2008 and by an ERC Starting Grant, Agreement Number 307732.

Identifier

http://hdl.handle.net/10362/10852

Language(s)

eng

Publisher

Faculdade de Ciências e Tecnologia

Rights

openAccess

Keywords

#Cloud computing #Geo-replication #Eventual consistency #Bounded divergence #Probabilistic metrics

Type

masterThesis