Transparent and autonomic rollback-recovery in cluster systems


Author(s): Maloney, Andrew; Goscinski, Andrzej
Contributor(s)

Hobbs, Michael

Xiang, Yang

Zhou, Wanlei

Date(s)

01/01/2008

Abstract

Cluster systems provide an excellent environment for running computation-hungry applications. However, because they are built from commodity components, they are prone to failures. To overcome these failures we propose to use rollback-recovery, which consists of checkpointing and recovery facilities. Checkpointing facilities have been the focus of many previous studies; however, recovery facilities have been overlooked. This paper focuses on the requirements, concept and architecture of recovery facilities. The synthesized fault-tolerant system was implemented in the GENESIS system and evaluated. The results show that the synthesized system is efficient and scalable.

Identifier

http://hdl.handle.net/10536/DRO/DU:30018277

Language(s)

eng

Publisher

IEEE Computer Society

Relation

http://dro.deakin.edu.au/eserv/DU:30018277/goscinski-transparentandautomatic-2008.pdf

http://dx.doi.org/10.1109/ICPADS.2008.117

Rights

2008, IEEE

Keywords #cluster systems #fault tolerance #rollback-recovery

Type

Conference Paper