Various load-balancing issues in structured overlays
There are numerous types of load-imbalances that can occur in distributed
systems in general, and structured overlays in particular.Last updated: 25th June 2006
- Skewed distribution of keys over key-space, skewed access load, peer heterogeneity.
We have been looking into several of these issues, particularly in the context of the P-Grid and Oscar overlay networks.
Storage and replication balancing
The application of structured overlay networks to implement index structures for data-oriented applications such as peer-to-peer databases or peer-to-peer information retrieval, requires highly efficient approaches for overlay construction, as changing application requirements frequently lead to re-indexing of the data and hence (re-)construction of overlay networks. We describe an approach for the efficient construction of data-oriented, structured overlay networks from scratch in a self-organized way. Standard maintenance algorithms for overlay networks cannot accomplish this efficiently, as they are inherently sequential. Our proposed algorithm is completely decentralized, parallel, and can construct a new overlay network with short latency. At the same time it ensures good load-balancing for skewed data key distributions which result from preserving key order relationships as necessitated by data-oriented applications. We provide both a theoretical analysis of the basic algorithms and a complete system implementation that has been tested on PlanetLab.
Publications: Starting from early
heuristics [technical report], the idea matured into overlay
construction [VLDB 2005] and
maintenance [Self-* 2005] algorithms based
on fundamental understanding of the involved dynamics.
Caching and query-load balancing
Queries for different items typically has a skewed distribution over the key space. In this work we identify the optimal replication strategy (number of replicas as well as placement) for a large family of structured overlays to balance query-forwarding and answering load, and also expose the limitations and drawbacks of stand-alone caching strategies, without changing the routing mechanism itself. We present our results in the DBISP2P 2006/CCGrid 2007 papers where we see how the routing redundancy (usually necessary for resilience) can also be used for query-load balancing. Currently we are working on ideas to further improve the quality of query loads (forwarding and answering) balancing, as well as integrate the algorithms in the P-Grid implementation.Beyond load-balancing: When heterogeneity is the norm!
Deployed (unstructured) P2P systems exhibit wide heterogeneity in both availability and willingness of the resources individual autonomous peers are willing to provide. System designers need to explicitly take into account such heterogeneity issues, and instead design systems which make as good use of available resources as possible. Our ongoing work on the Oscar system is aimed to address these issues.p.s. Orthogonal to this, we envision use of rewards and/or punishments to make peers contribute resources to the system and achieve some kind of load-balancing. We have not looked into such encouragement/enforcement issues so far.