Various load-balancing issues in structured overlays

There are numerous types of load-imbalances that can occur in distributed systems in general, and structured overlays in particular.
  • Skewed distribution of keys over key-space, skewed access load, peer heterogeneity.

We have been looking into several of these issues, particularly in the context of the P-Grid and Oscar overlay networks.

Storage and replication balancing

The application of structured overlay networks to implement index structures for data-oriented applications such as peer-to-peer databases or peer-to-peer information retrieval, requires highly efficient approaches for overlay construction, as changing application requirements frequently lead to re-indexing of the data and hence (re-)construction of overlay networks. We describe an approach for the efficient construction of data-oriented, structured overlay networks from scratch in a self-organized way. Standard maintenance algorithms for overlay networks cannot accomplish this efficiently, as they are inherently sequential. Our proposed algorithm is completely decentralized, parallel, and can construct a new overlay network with short latency. At the same time it ensures good load-balancing for skewed data key distributions which result from preserving key order relationships as necessitated by data-oriented applications. We provide both a theoretical analysis of the basic algorithms and a complete system implementation that has been tested on PlanetLab.


Publications: Starting from early heuristics [technical report], the idea matured into overlay construction [VLDB 2005] and maintenance [Self-* 2005] algorithms based on fundamental understanding of the involved dynamics.

Caching and query-load balancing

Queries for different items typically has a skewed distribution over the key space. In this work we identify the optimal replication strategy (number of replicas as well as placement) for a large family of structured overlays to balance query-forwarding and answering load, and also expose the limitations and drawbacks of stand-alone caching strategies, without changing the routing mechanism itself. We present our results in the DBISP2P 2006/CCGrid 2007 papers where we see how the routing redundancy (usually necessary for resilience) can also be used for query-load balancing. Currently we are working on ideas to further improve the quality of query loads (forwarding and answering) balancing, as well as integrate the algorithms in the P-Grid implementation.

Beyond load-balancing: When heterogeneity is the norm!

Deployed (unstructured) P2P systems exhibit wide heterogeneity in both availability and willingness of the resources individual autonomous peers are willing to provide. System designers need to explicitly take into account such heterogeneity issues, and instead design systems which make as good use of available resources as possible. Our ongoing work on the Oscar system is aimed to address these issues.

p.s. Orthogonal to this, we envision use of rewards and/or punishments to make peers contribute resources to the system and achieve some kind of load-balancing. We have not looked into such encouragement/enforcement issues so far.

Last updated: 25th June 2006
details
PhD. thesis

Overlays

P2P Storage

Social Networks

  • Implicit networks
  • P2P Social Softwares
  • Collaboration

Etc