Distributed Graph Coloring Algorithm Based on Pregel Model

doi:10.3778/j.issn.1673-9418.1709036

Abstract

Abstract: Graph coloring is one of the best-known classical problems in computer science and mathematics. As graph data grows in scale, the performance of single-machine graph coloring algorithms becomes limited. Most existing distributed graph coloring algorithms are based on a shared-memory message-passing model. Meanwhile, the Pregel model, which has a shared-nothing architecture, has improved the capacity for processing large-scale graph data and has become a key technology in that area. However, no existing work adapts distributed graph coloring algorithms to the shared-nothing Pregel model and compares them analytically and experimentally. To improve the performance of graph coloring, and inspired by the classical MIS (maximal-independent-set) graph coloring algorithm, this paper designs MIS-Pregel, a distributed graph coloring algorithm based on the Pregel model. It then proposes two optimization strategies targeting coloring time and the total number of colors: the first is based on the JP algorithm and the second on the LDF algorithm. The basic algorithm MIS-Pregel and the two optimized algorithms, JP-Pregel and LDF-Pregel, are implemented on Spark GraphX. Extensive experiments on both synthetic and real datasets show that the basic algorithm colors graphs efficiently and that JP-Pregel and LDF-Pregel reduce coloring time relative to MIS-Pregel by an average of 26.4% and 30.9%, respectively.

Key words: distributed graph coloring, Pregel model, Spark, GraphX

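Abstract (translated from the Chinese): The graph coloring problem has long been one of the most famous and classical research problems in computer science and mathematics. As the scale of graph data keeps growing, the performance of single-machine graph coloring algorithms is limited. Most existing distributed graph coloring algorithms are based on a shared-memory message-passing model, whereas the introduction and development of the shared-nothing Pregel computing model has improved the ability to process large-scale graph data, and Pregel has become one of today's mainstream frameworks for big data processing. Still lacking, however, is work that adapts existing distributed graph coloring algorithms to the Pregel model for algorithmic study and experimental comparison. To improve the performance of graph coloring algorithms, and inspired by the classical graph coloring algorithm MIS (maximal-independent-set), a distributed graph coloring algorithm based on the Pregel model, MIS-Pregel, is designed. Two different optimization strategies are proposed with respect to coloring time and the number of colors required: the first is based on the JP algorithm and the second on the LDF algorithm. MIS-Pregel and the two improved algorithms, JP-Pregel and LDF-Pregel, are developed on the Spark GraphX framework, which implements the mainstream graph data processing model Pregel. Experiments were carried out on synthetic and real datasets, and extensive results show that the proposed distributed graph coloring algorithms complete graph coloring tasks efficiently, and that the coloring times of JP-Pregel and LDF-Pregel are on average 26.4% and 30.9% shorter, respectively, than that of MIS-Pregel.

Key words (translated from the Chinese): distributed graph coloring, Pregel model, Spark, GraphX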
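
The abstract names the algorithms but does not spell out their steps. As a rough single-machine illustration of the general idea behind priority-based independent-set coloring, the Python sketch below simulates synchronous rounds, with each loop iteration standing in for one Pregel superstep. It is not the paper's GraphX implementation. The function names (`color_graph`, `jp_priority`, `ldf_priority`), the tie-breaking by vertex ID, and the sample graph are all assumptions made for illustration; random priorities follow the usual description of Jones-Plassmann-style coloring, and degree-based priorities follow largest-degree-first.

```python
import random


def jp_priority(adj, seed=0):
    # Assumption: JP-style ordering uses a random number per vertex,
    # with the vertex ID as a tie-breaker so priorities are unique.
    rng = random.Random(seed)
    return {v: (rng.random(), v) for v in adj}


def ldf_priority(adj):
    # Assumption: LDF-style ordering prefers higher degree,
    # again with the vertex ID as a tie-breaker.
    return {v: (len(adj[v]), v) for v in adj}


def color_graph(adj, priority):
    """Color an undirected graph in rounds; returns (colors, rounds)."""
    color = {}
    rounds = 0
    while len(color) < len(adj):
        # A vertex wins the round if it out-ranks every uncolored neighbor.
        # Two adjacent vertices cannot both win, so winners are independent.
        winners = [
            v for v in adj
            if v not in color
            and all(priority[v] > priority[u] for u in adj[v] if u not in color)
        ]
        for v in winners:
            # Pick the smallest color not used by already-colored neighbors.
            used = {color[u] for u in adj[v] if u in color}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        rounds += 1
    return color, rounds


# Hypothetical 5-vertex graph given as symmetric adjacency lists.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
ldf_colors, ldf_rounds = color_graph(adj, ldf_priority(adj))
jp_colors, jp_rounds = color_graph(adj, jp_priority(adj))
```

Because priorities are unique, the highest-ranked uncolored vertex always wins, so every round colors at least one vertex and the loop terminates. In a real Pregel setting the neighbor checks would be carried by messages between vertices rather than by direct dictionary lookups.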

GAN Ying, WANG Xin, FENG Zhiyong, YANG Yajun. Distributed Graph Coloring Algorithm Based on Pregel Model[J]. Journal of Frontiers of Computer Science and Technology, 2018, 12(6): 886-897.

甘瀛, 王鑫, 冯志勇, 杨雅君. Distributed Graph Coloring Algorithm Based on Pregel Model[J]. 计算机科学与探索, 2018, 12(6): 886-897.

