Improving the Robustness of Distributed Failure Detectors in Adverse Conditions


Author(s): Lemos, Fernando Tarla Cardoso; Sato, Liria Matsumoto
Contributor(s)

UNIVERSIDADE DE SÃO PAULO

Date(s)

14/10/2013

14/10/2013

2012

Abstract

Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new failure detection algorithms suitable as components of a fault tolerance system deployed under adverse network conditions (such as loosely connected and administered computing grids). The algorithms pack redundancy into heartbeat messages, which makes them more robust than traditional protocols. Experimental tests in a simulated environment with adverse network conditions show significant improvement over existing solutions.
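The abstract does not describe the algorithms themselves. As a rough illustration of what packing redundancy into heartbeat messages could look like, the sketch below lets every heartbeat also carry the highest sequence number its sender has seen from each peer, so evidence that a node is alive can arrive over an indirect path when the direct link drops messages. This is an assumption made for illustration, not the authors' method; the names Heartbeat, Detector, seen and suspects are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Heartbeat:
    sender: str
    seq: int
    # Assumed redundant payload: highest sequence number the sender
    # has seen from each peer (hypothetical design, not from the paper).
    seen: dict = field(default_factory=dict)


class Detector:
    def __init__(self, node_id, timeout):
        self.node_id = node_id
        self.timeout = timeout
        self.seq = 0
        self.latest = {}       # peer -> highest known sequence number
        self.last_update = {}  # peer -> local time that number last grew

    def make_heartbeat(self):
        self.seq += 1
        return Heartbeat(self.node_id, self.seq, dict(self.latest))

    def _observe(self, peer, seq, now):
        # Only a strictly newer sequence number counts as fresh evidence.
        if peer != self.node_id and seq > self.latest.get(peer, 0):
            self.latest[peer] = seq
            self.last_update[peer] = now

    def receive(self, hb, now):
        self._observe(hb.sender, hb.seq, now)
        for peer, seq in hb.seen.items():
            self._observe(peer, seq, now)

    def suspects(self, now):
        return sorted(p for p, t in self.last_update.items()
                      if now - t > self.timeout)


# Scenario: after t=0 every heartbeat from c to a is lost,
# but b still hears c and relays c's progress inside its own heartbeats.
a, b, c = Detector("a", 3.0), Detector("b", 3.0), Detector("c", 3.0)
first = c.make_heartbeat()
a.receive(first, 0.0)
b.receive(first, 0.0)
for t in range(1, 6):
    b.receive(c.make_heartbeat(), float(t))
    a.receive(b.make_heartbeat(), float(t))
alive_view = a.suspects(5.0)   # [] : c is not suspected despite the lost link

# c crashes after t=5; b keeps sending, but c's sequence number stops growing.
for t in range(6, 10):
    a.receive(b.make_heartbeat(), float(t))
crash_view = a.suspects(9.0)   # ['c'] : stale relayed data does not hide the crash
```

In this sketch a relayed entry refreshes a peer's timer only when its sequence number increases, so redundancy masks message loss without delaying detection of a real crash by more than the timeout.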

Identifier

IEEE LATIN AMERICA TRANSACTIONS, PISCATAWAY, v. 10, n. 1, supl. 1, Part 1, pp. 1364-1369, JAN, 2012

1548-0992

http://www.producao.usp.br/handle/BDPI/34424

10.1109/TLA.2012.6142485

http://dx.doi.org/10.1109/TLA.2012.6142485

Language(s)

por

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

PISCATAWAY

Relation

IEEE LATIN AMERICA TRANSACTIONS

Rights

closedAccess

Copyright IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

#FAULT TOLERANCE #FAILURE DETECTION #DISTRIBUTED FAILURE DETECTORS #COMPUTER SCIENCE, INFORMATION SYSTEMS #ENGINEERING, ELECTRICAL & ELECTRONIC

Type

article

original article

publishedVersion