An online data access prediction and optimization approach for distributed systems
Contributor(s) |
UNIVERSIDADE DE SÃO PAULO |
---|---|
Date(s) |
07/11/2013
07/11/2013
2012
|
Abstract |
Current scientific applications produce large amounts of data. Processing, handling, and analyzing such data requires large-scale computing infrastructures such as clusters and grids. Studies in this area aim to improve the performance of data-intensive applications by optimizing data accesses. To this end, distributed storage systems have adopted techniques such as data replication, migration, distribution, and access parallelism. The main drawback of those studies, however, is that they do not take application behavior into account when optimizing data access. This limitation motivated this paper, which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. To accomplish this, the approach organizes application behaviors as time series and then analyzes and classifies those series according to their properties. Based on those properties, the approach selects modeling techniques to represent the series and perform predictions, which are later used to optimize data access operations. The approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm that the approach reduces application execution time by about 50 percent, especially when handling large amounts of data. Funding: FAPESP (São Paulo Research Foundation, Brazil) [2011/02655-9]; CNPq (National Council for Scientific and Technological Development research funding agency) [304338/2008-7 and 470739/2008-8] |
Identifier |
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, LOS ALAMITOS, v. 23, n. 6, p. 1017-1029, JUN, 2012; ISSN 1045-9219; http://www.producao.usp.br/handle/BDPI/43238; DOI 10.1109/TPDS.2011.256 |
Language(s) |
eng |
Publisher |
IEEE COMPUTER SOC LOS ALAMITOS |
Relation |
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS |
Rights |
restrictedAccess Copyright IEEE COMPUTER SOC |
Keywords | #DISTRIBUTED COMPUTING #DISTRIBUTED FILE SYSTEM #DATA ACCESS OPTIMIZATION #TIME SERIES ANALYSIS #PREDICTION #TIME-SERIES #RECURRENCE PLOTS #PACKAGE #GRIDS #COMPUTER SCIENCE, THEORY & METHODS #ENGINEERING, ELECTRICAL & ELECTRONIC |
Type |
article original article publishedVersion |
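
The abstract describes a pipeline: record application behavior as a time series, classify the series by its properties, select a model that suits those properties, and use the prediction to guide data access. The sketch below illustrates only that general shape and is not the authors' method. The three property labels (`constant`, `linear`, `irregular`), the matching predictors, and the function names `classify` and `predict_next` are all assumptions made for illustration; the record does not state which properties or models the paper uses.

```python
from statistics import mean


def classify(series, tol=1e-9):
    """Assign a crude, hypothetical property label to a series of observations."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    if all(abs(d) <= tol for d in diffs):
        return "constant"   # every observation equals the previous one
    if all(abs(d - diffs[0]) <= tol for d in diffs):
        return "linear"     # constant stride between observations
    return "irregular"      # no simple structure detected


def predict_next(series, window=3):
    """Pick a predictor based on the label and return (label, prediction)."""
    kind = classify(series)
    if kind == "constant":
        return kind, series[-1]
    if kind == "linear":
        # Extrapolate the last observed stride.
        return kind, series[-1] + (series[-1] - series[-2])
    # Fallback: average of the most recent observations.
    return kind, mean(series[-window:])


# Hypothetical example: block indices read by an application so far.
observed_blocks = [10, 12, 14, 16]
label, next_block = predict_next(observed_blocks)
print(label, next_block)
```

In a storage system, a prediction such as `next_block` could be used to prefetch or replicate data before the application requests it. Everything is computed online from the current run's observations, which matches the abstract's point that no information on past executions is required.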