A distributed algorithm for large-scale graph partitioning
Rahimian F., Payberah AH., Girdzijauskas S., Jelasity M., Haridi S.
© 2015 ACM. Balanced graph partitioning is an NP-complete problem with a wide range of applications. These applications include many large-scale distributed problems, including the optimal storage of large sets of graph-structured data over several hosts. However, in very large-scale distributed scenarios, state-of-the-art algorithms are not directly applicable because they typically involve frequent global operations over the entire graph. In this article, we propose a fully distributed algorithm called Ja-be-Ja that uses local search and simulated annealing techniques for two types of graph partitioning: edge-cut partitioning and vertex-cut partitioning. The algorithm is massively parallel: There is no central coordination, each vertex is processed independently, and only the direct neighbors of a vertex and a small subset of random vertices in the graph need to be known locally. Strict synchronization is not required. These features allow Ja-be-Ja to be easily adapted to any distributed graph-processing system from data centers to fully distributed networks. We show that the minimal edge-cut value empirically achieved by Ja-be-Ja is comparable to state-of-the-art centralized algorithms such as Metis. In particular, on large social networks, Ja-be-Ja outperforms Metis. We also show that Ja-be-Ja computes very low vertex-cuts, which are proved significantly more effective than edge-cuts for processing most real-world graphs.