A general communication cost optimization framework for big data stream processing in geo-distributed data centers
Date(s) |
01/01/2016
|
---|---|
Abstract |
With the explosion of big data, processing large numbers of continuous data streams, i.e., big data stream processing (BDSP), has become a crucial requirement for many scientific and industrial applications in recent years. By offering a pool of computation, communication and storage resources, public clouds, like Amazon's EC2, are undoubtedly the most efficient platforms to meet the ever-growing needs of BDSP. Public cloud service providers usually operate a number of geo-distributed datacenters across the globe. Different datacenter pairs incur different inter-datacenter network costs charged by Internet Service Providers (ISPs). Meanwhile, inter-datacenter traffic in BDSP constitutes a large portion of a cloud provider's traffic demand over the Internet and incurs substantial communication cost, which may even become the dominant operational expenditure factor. As the datacenter resources are provided in a virtualized way, the virtual machines (VMs) for stream processing tasks can be freely deployed onto any datacenter, provided that the Service Level Agreement (SLA, e.g., quality-of-information) is obeyed. This presents both an opportunity and a challenge: to exploit the diversity of inter-datacenter network costs to optimize both VM placement and load balancing towards network cost minimization with guaranteed SLA. In this paper, we first propose a general modeling framework that describes all representative inter-task relationship semantics in BDSP. Based on our novel framework, we then formulate the communication cost minimization problem for BDSP as a mixed-integer linear programming (MILP) problem and prove it to be NP-hard. We then propose a computation-efficient solution based on MILP. The high efficiency of our proposal is validated by extensive simulation-based studies. |
Identifier | |
Language(s) |
eng |
Publisher |
IEEE |
Relation |
http://dro.deakin.edu.au/eserv/DU:30080752/xiang-ageneralcommunication-2016.pdf http://www.dx.doi.org/10.1109/TC.2015.2417566 |
Rights |
2016, IEEE |
Keywords | #Science & Technology #Technology #Computer Science, Hardware & Architecture #Engineering, Electrical & Electronic #Computer Science #Engineering #Big data #stream processing #network cost minimization #VM placement #geo-distributed data centers #VIRTUAL MACHINE PLACEMENT #EFFICIENCY |
Type |
Journal Article |