Abstract
In this paper, we consider the problem of data migration in large scale storage systems in the homogeneous setting where a set of files needs to be transferred within a set of disks, and each disk can only engage in one transmission at a time. The objective is to schedule the data transmissions in the fewest possible time steps.
When the file sizes are uniform, the problem is equivalent to edge coloring in multigraphs, which has been studied for decades, and many well-known algorithms have been devised. We consider the problem when the file sizes are non-uniform. We propose two algorithms for the problem – Maximal Compatible Sets (MCS) and the Tetris algorithm, and prove that the Tetris algorithm is a 4-approximation algorithm. We also empirically compare the performance of these algorithms against one of the most recent algorithms for the uniform file size problem, the Greedy-Euler-Color algorithm, and find that the Tetris algorithm performs the best in practice for both uniform and non-uniform file size scenarios.