Abstract
In this paper, we consider the problem of data migration in large scale storage systems, where a massive set of data needs to be transferred within a set of disks. The objective is to schedule the data transmissions in the fewest possible time steps. There are two major settings in the study of data migration: homogeneous case - every disk can only engage in one transmission at one time; and heterogeneous case - each disk has a transfer capacity which represents the number the transmissions the disk can engage simultaneously. Most of the literature assumes each disk has unlimited space, which is not realistic in practice. Our paper is the first to consider space constraints in the heterogeneous setting. Each disk v has a total space max{d in v , d out v }+1, where d in v and d out v are the number of data to be received and, respectively, sent by v. We propose four different heuristic algorithms and also formulate the problem as an integer program (IP) that yields an optimal schedule. We conduct simulations to compare the performance of these algorithms, including the optimal result from the integer program when the instance is small enough for the IP computation to be feasible.