Distributed computing of all-to-all comparison problems in heterogeneous systems


Author(s): Zhang, Yi-Fan; Tian, Yu-Chu; Kelly, Wayne; Fidge, Colin J.
Date(s)

09/11/2015

Abstract

Distributed computing of all-to-all comparison (ATAC) problems in heterogeneous systems is increasingly important in various domains. Though Hadoop-based solutions are widely used, they are inefficient for the ATAC pattern, which is fundamentally different from the MapReduce pattern for which Hadoop is designed. They exhibit poor data locality and unbalanced allocation of comparison tasks, particularly in heterogeneous systems. This results in massive data movement at runtime and ineffective utilization of computing resources, which significantly degrades overall computing performance. To address these problems, this paper presents a scalable and efficient data and task distribution strategy for processing large-scale ATAC problems in heterogeneous systems. The strategy not only saves storage space but also achieves load balancing and good data locality for all comparison tasks. Experiments on bioinformatics examples show that the presented approach achieves about 89% of the ideal performance capacity of the multiple machines.

Format

application/pdf

Identifier

http://eprints.qut.edu.au/87374/

Publisher

IEEE

Relation

http://eprints.qut.edu.au/87374/1/paper_IECON15_V8_Glen_eps_withInfor.pdf

http://iecon2015.com/

Zhang, Yi-Fan, Tian, Yu-Chu, Kelly, Wayne, & Fidge, Colin J. (2015) Distributed computing of all-to-all comparison problems in heterogeneous systems. In Proceedings of the 41st Annual Conference of the IEEE Industrial Electronics Society, IEEE, Yokohama, Japan. (In Press)

Rights

Copyright 2015 IEEE

Source

School of Electrical Engineering & Computer Science; Science & Engineering Faculty

Keywords

#080599 Distributed Computing not elsewhere classified #Big data #distributed computing #all-to-all comparison #data distribution

Type

Conference Paper