Data Transfer Scheduling for Maximizing Throughput of Big-Data Computing in Cloud Systems

Ruitao Xie, Xiaohua Jia

January 2018

Abstract

Many big-data computing applications have been deployed on cloud platforms. These applications normally demand concurrent data transfers among computing nodes for parallel processing. It is important to find the transfer schedule that yields the least data retrieval time, or in other words, the maximum throughput. However, existing methods cannot achieve this, because they ignore link bandwidths and the diversity of data replicas and paths. In this paper, we aim to develop a max-throughput data transfer scheduling method that minimizes the data retrieval time of applications. Specifically, the problem is formulated as a mixed integer program, and an approximation algorithm is proposed, with its approximation ratio analyzed. Extensive simulations demonstrate that our algorithm can obtain near-optimal solutions.

Type

Journal article

Publication

IEEE Transactions on Cloud Computing, vol. 6, no. 1, pp. 87-98 (Chinese Academy of Sciences journal ranking: Tier 1 in its major category)

Cloud Computing; Data Transfer Scheduling

Ruitao Xie

Associate Professor