Healing replicas in a software component replication system


Author(s): Alves, André Nunes Gomes
Contributor(s)

Preguiça, Nuno

Date(s)

10/02/2014

10/02/2014

2013

Abstract

Dissertation submitted for the degree of Master in Informatics Engineering

Replication is a key technique for improving the performance, availability and fault tolerance of systems. Replicated systems exist in different settings, from large geo-replicated cloud systems to replicated databases running on multi-core machines. One feature that is often important is a mechanism to verify that replica contents remain in sync despite any problem that may occur, e.g. silent bugs that corrupt service state. Traditional techniques for summarizing service state require that the internal service state be exactly the same after executing the same set of operations. However, for many applications this does not hold, especially if operations are allowed to execute in different orders or if different implementations are used in different replicas. In this work we propose a new approach for summarizing and recovering the state of a replicated service. Our approach is based on a novel data structure, the Scalable Counting Bloom Filter. This data structure combines the ideas of Counting Bloom Filters and Scalable Bloom Filters to create a Bloom Filter variant that supports delete operations and allows the structure to grow, thus adapting to the size of any service state. We propose an approach that uses this data structure to summarize the state of a replicated service while allowing concurrent operations to execute. We further propose a strategy to recover replicas in a replicated system and describe how to implement our proposed solution in two in-memory databases: H2 and HSQL. The evaluation results show that our approach computes the same summary when executing the same set of operations in both databases, thus allowing our solution to be used in diverse replication scenarios. The results also show that additional work on performance optimization is necessary to make our solution practical.
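
To make the combination of the two Bloom filter variants concrete, here is a minimal sketch in Python. It is not the thesis implementation: the class names, the salted SHA-256 hashing, the sizing rule of 8 counters per expected item, and the doubling growth policy are all assumptions chosen for illustration. Each slice is a counting Bloom filter (counters instead of bits, so items can be deleted), and a new, larger slice is appended once the current one reaches its capacity.

```python
import hashlib


class CountingBloomFilter:
    """Fixed-size counting Bloom filter: counters instead of bits, so items can be removed."""

    def __init__(self, capacity, num_hashes, counters_per_item=8):
        self.capacity = capacity          # items this slice accepts before the owner grows
        self.num_hashes = num_hashes
        self.counters = [0] * (capacity * counters_per_item)  # assumed sizing rule
        self.count = 0                    # items currently stored in this slice

    def _positions(self, item):
        # Derive num_hashes counter positions from salted SHA-256 digests (illustrative choice).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.counters)

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1
        self.count += 1

    def contains(self, item):
        return all(self.counters[pos] > 0 for pos in self._positions(item))

    def remove(self, item):
        # Caller must only remove items that were added, or counters become inconsistent.
        for pos in self._positions(item):
            self.counters[pos] -= 1
        self.count -= 1


class ScalableCountingBloomFilter:
    """Sequence of counting slices; a larger slice is appended when the last one is full."""

    def __init__(self, initial_capacity=64, num_hashes=4):
        self.num_hashes = num_hashes
        self.slices = [CountingBloomFilter(initial_capacity, num_hashes)]

    def add(self, item):
        last = self.slices[-1]
        if last.count >= last.capacity:
            # Assumed growth policy: each new slice doubles the previous capacity.
            last = CountingBloomFilter(last.capacity * 2, self.num_hashes)
            self.slices.append(last)
        last.add(item)

    def contains(self, item):
        return any(s.contains(item) for s in self.slices)

    def remove(self, item):
        # Simplification: remove from the newest slice that reports the item.
        # A false positive in a newer slice could make this pick the wrong slice.
        for s in reversed(self.slices):
            if s.contains(item):
                s.remove(item)
                return True
        return False


if __name__ == "__main__":
    f = ScalableCountingBloomFilter(initial_capacity=64)
    for i in range(200):
        f.add(f"row-{i}")
    print(len(f.slices), f.contains("row-0"))
```

Note that in this naive version the slice an item lands in depends on insertion order, and a removal can target the wrong slice after a false positive. The abstract does not say how the thesis handles either issue, so the sketch should be read only as an illustration of the "counting plus scalable" idea.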

Identifier

http://hdl.handle.net/10362/11353

Language(s)

eng

Publisher

Faculdade de Ciências e Tecnologia

Rights

openAccess

Keywords

#Component replication #Fault-tolerance #Performance #Multi-core processor

Type

masterThesis