"Heterogeneity and Load Balance in Distributed Hash Tables"
Existing solutions to achieve load balancing in DHTs incur a high overhead either in terms of routing state or in terms of load movement generated by nodes arriving or departing the system. In this paper, we propose a set of general techniques and use them to develop a protocol based on Chord, called Y_0, that achieves load balancing with minimal overhead under the typical assumption that the load is uniformly distributed in the identifier space. In particular, we prove that Y_0 can achieve near-optimal load balancing while moving little load to maintain the balance and increasing the size of the routing tables by at most a constant factor.
Using extensive simulations based on real-world and synthetic capacity distributions, we show that Y_0 reduces the load imbalance of Chord from O(log n) to a less than 3.6 without increasing the number of links that a node needs to maintain. In addition, we study the effect of heterogeneity on both DHTs, demonstrating significantly reduced average route length as node capacities become increasingly heterogeneous. For a real-word distribution of node capacities, the route length in Y_0 is asymptotically less than half the route length in the case of a homogeneous system.
This work is with Ion Stoica.