融合图注意力的多特征链接预测算法

doi:10.3778/j.issn.1673-9418.2012092

计算机科学与探索 ›› 2022, Vol. 16 ›› Issue (5): 1096-1106.DOI: 10.3778/j.issn.1673-9418.2012092

融合图注意力的多特征链接预测算法

张雁操, 赵宇海(), 史岚

东北大学计算机科学与工程学院,沈阳 110169

收稿日期:2020-12-24 修回日期:2021-03-12 出版日期:2022-05-01 发布日期:2022-05-19
通讯作者: + E-mail: zhaoyuhai@mail.neu.edu.cn
作者简介:张雁操（1997—）,男,安徽阜阳人,硕士研究生,主要研究方向为机器学习、复杂网络等。
赵宇海（1975—）,男,辽宁鞍山人,博士,教授,博士生导师,CCF高级会员,主要研究方向为机器学习、社交网络分析等。
史岚（1964—）,女,辽宁沈阳人,硕士,副教授,主要研究方向为计算机体系结构、网络信息安全等。
基金资助:
国家自然科学基金(61772124);国家重点研发计划(2018YFB1004402)

Multi-feature Based Link Prediction Algorithm Fusing Graph Attention

ZHANG Yancao, ZHAO Yuhai(), SHI Lan

College of Computer Science and Engineering, Northeastern University, Shenyang 110169, China

Received:2020-12-24 Revised:2021-03-12 Online:2022-05-01 Published:2022-05-19
About author:ZHANG Yancao, born in 1997, M.S. candidate. His research interests include machine learning, complex network, etc.
ZHAO Yuhai, born in 1975, Ph.D., professor, Ph.D. supervisor, senior member of CCF. His research interests include machine learning, social network analysis, etc.
SHI Lan, born in 1964, M.S., associate professor. Her research interests include computer architecture, network information security, etc.
Supported by:
National Natural Science Foundation of China(61772124);National Key Research and Development Program of China(2018YFB1004402)

摘要/Abstract

摘要：

链接预测是复杂网络中重要的研究方向之一。利用神经网络学习预定义的启发式特征近年来受到广泛关注。但是目前此类方法主要利用目标链接的局部子图预测链接,具有较强的局部性。针对这一问题,在SEAL算法的基础上,提出了利用多特征融合图注意力进行链接预测的算法ADNSL。该模型支持多类型的节点嵌入特征作为输入,包括局部特征生成和全局特征提取两部分。对于局部特征生成模块,利用图卷积层,将局部子图中的节点特征交互融合。为了弥补SEAL中的特征无效性和节点无偏性,提出了双向无参注意力。在全局特征提取模块中,利用迭代公式生成聚合图以降低struc2vec节点嵌入算法的复杂度,进而从全局角度挖掘可解释的结构特征,可以有效提升链接预测算法性能。实验表明,ADNSL算法可以合理地利用多类型节点嵌入特征,在八个不同领域的真实数据集上的表现明显优于多个基准算法。

关键词: 链接预测, 节点嵌入, 图神经网络, 注意力机制, 聚合图

Abstract:

Link prediction is one of the important research directions in complex networks. Using neural networks to learn predefined heuristic features has attracted extensive attention in recent years. However, at present, this kind of method mainly uses the local subgraph of the target link to predict the link, which has strong locality. To solve this problem, based on SEAL (learning from subgraphs, embeddings and attributes for link prediction) algorithm, this paper proposes an algorithm ADNSL (learning from double-radius node labeling, node2vec and struc2vec for link prediction with attention mechanism) for link prediction using multi-feature fusing graph attention, which supports multi-type node embedding features as inputs, including local feature generation and global feature extraction. For the local feature generation module, the node features in the local subgraph are interactively fused by using the graph convolution layer. In order to make up for the feature ineffectiveness and node unbiasedness in SEAL, a bidirectional parameterless attention is proposed. The global feature extraction module uses iterative formula to generate agglomerative graph to reduce the complexity of struc2vec node embedding algorithm, and then mines interpretable structural features from a global perspective, which can effectively improve the performance of link prediction algorithm. Experiments show that ADNSL algorithm can reasonably utilize multi-type node embedding features, and its performance is obviously better than that of multiple benchmark algorithms across eight real datasets in different fields.

Key words: link prediction, node embedding, graph neural network, attention mechanism, agglomerative graph

中图分类号:

TP391

张雁操, 赵宇海, 史岚. 融合图注意力的多特征链接预测算法[J]. 计算机科学与探索, 2022, 16(5): 1096-1106.

ZHANG Yancao, ZHAO Yuhai, SHI Lan. Multi-feature Based Link Prediction Algorithm Fusing Graph Attention[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(5): 1096-1106.

图/表 10

图1 ADNSL框架

Fig.1 Framework of ADNSL

图2 局部特征生成模块

Fig.2 Local feature generation module

图3 DRNL特征无效性

Fig.3 Ineffectivity of DRNL features

图4 softmax前图注意力权重示意图

Fig.4 Graph attention weights before softmax

图5 全局特征提取模块

Fig.5 Global feature extraction module

图6 聚合图

Fig.6 Agglomerative graph

表1 实验数据集

Table 1 Experiment datasets

数据集	$\| V \|$	$\| E \|$	平均度
USAir	332	2 126	12.81
NS	1 589	2 742	3.45
PB	1 222	16 714	27.36
Yeast	2 375	11 693	9.85
C.ele	297	2 148	14.46
Power	4 941	6 594	2.67
Router	5 022	6 258	2.49
E.coli	1 805	14 660	12.55

表1 实验数据集

Table 1 Experiment datasets

数据集	$\| V \|$	$\| E \|$	平均度
USAir	332	2 126	12.81
NS	1 589	2 742	3.45
PB	1 222	16 714	27.36
Yeast	2 375	11 693	9.85
C.ele	297	2 148	14.46
Power	4 941	6 594	2.67
Router	5 022	6 258	2.49
E.coli	1 805	14 660	12.55

表2 模型AUC对比（50%训练链接）

Table 2 AUC comparison of different approaches (50% training links) %

Model	USAir	NS	PB	Yeast	C.ele	Power	Router	E.coli
CN	87.94	77.14	86.73	82.58	72.24	53.37	52.91	86.57
JA	84.80	77.12	83.44	82.50	69.76	53.37	52.92	81.72
PA	87.57	65.83	89.54	81.59	73.79	46.80	55.05	90.81
AA	88.60	77.13	87.07	82.61	73.36	53.37	52.93	87.64
PR	90.59	82.29	92.24	89.33	84.95	57.35	54.41	92.95
SR	81.05	81.59	81.81	88.52	76.05	56.17	54.37	73.73
LINE	72.21	65.97	75.52	79.53	59.55	53.46	62.42	74.41
N2V	84.63	80.28	79.27	90.17	75.56	55.40	62.46	84.71
WLNM	91.44	87.60	90.91	92.23	75.70	64.11	86.12	92.79
SEAL	92.86	86.76	93.79	93.88	83.87	62.47	85.96	94.28
ASEAL	93.74	92.66	93.73	94.75	81.54	65.34	92.31	95.05
DNSL	94.16	94.13	94.05	95.11	85.06	79.30	91.76	95.96
ADNSL	94.87	96.13	94.07	96.41	84.31	79.83	95.80	96.58

表3 模型AUC对比（90%训练链接）

Table 3 AUC comparison of different approaches (90% training links) %

Model	USAir	NS	PB	Yeast	C.ele	Power	Router	E.coli
CN	93.80	94.42	92.03	89.36	85.03	58.88	56.40	93.72
JA	89.89	94.43	87.39	89.32	80.09	58.88	56.41	81.28
PA	88.94	68.35	90.14	82.30	74.89	44.23	47.51	91.81
AA	95.05	94.43	92.34	89.44	86.84	58.89	56.41	95.32
PR	94.64	94.89	93.53	92.74	90.22	66.40	38.66	95.56
SR	78.39	94.09	77.28	91.49	77.02	76.21	37.47	62.51
LINE	81.17	80.64	76.97	87.35	69.28	55.61	67.13	82.40
N2V	91.42	91.51	85.78	93.66	84.16	76.21	65.49	90.83
WLNM	95.88	98.57	93.46	95.61	86.08	84.77	94.41	97.19
SEAL	96.47	97.30	95.07	97.26	88.59	85.46	94.97	97.57
ASEAL	96.17	98.97	95.52	97.58	88.33	86.11	95.19	97.83
DNSL	97.12	98.07	95.21	98.32	87.79	92.29	98.41	98.60
ADNSL	96.66	99.25	95.69	98.42	89.72	94.57	98.65	98.58

图7 收敛性分析（50%训练链接）

Fig.7 Convergence analysis (50% training links)

参考文献 25

[1]	李兰茜. 基于复杂网络结构的链路预测技术研究[D]. 北京: 北京邮电大学, 2019.
	LI L X. Link prediction based on the structure of complex networks[D]. Beijing: Beijing University of Posts and Tele-communications, 2019.
[2]	MARTÍNEZ V, BERZAL F, CUBERO J. A survey of link prediction in complex networks[J]. ACM Computing Surveys, 2016, 49(4): 1-33.
[3]	LIBEN-NOWELL D, KLEINBERG J. The link-prediction problem for social networks[J]. Journal of the American So-ciety for Information Science & Technology, 2007, 58(7): 1019-1031.
[4]	JACCARD P. Etude comparative de la distribution florale dans une portion des Alpes et des Jura[J]. Bulletin del la Societe Vaudoise des Sciences Naturelles, 1901, 37(142): 547-579.
[5]	BARABÁSI A, ALBERT R. Emergence of scaling in random networks[J]. Science, 1999, 286(5439): 509-512. DOI URL
[6]	ADAMIC L A, ADAR E. Friends and neighbors on the web[J]. Social Networks, 2003, 25(3): 211-230. DOI URL
[7]	ZHOU T, LÜ L Y, ZHANG Y C. Predicting missing links via local information[J]. European Physical Journal B, 2009, 71(4): 623-630. DOI URL
[8]	KATZ L. A new status index derived from sociometric ana-lysis[J]. Psychometrika, 1953, 18(1): 39-43. DOI URL
[9]	JEH G, WIDOM J. SimRank: a measure of structural-context similarity[C]// Proceedings of the 8th ACM SIGKDD Inter-national Conference on Knowledge Discovery and Data Mining, Edmonton, Jul 23-26, 2002. New York: ACM, 2002: 538-543.
[10]	LIU W P, LÜ L Y. Link prediction based on local random walk[J]. Europhysics Letters, 2010, 89(5): 58007. DOI URL
[11]	KOVÁCS I A, LUCK K, SPIROHN K, et al. Network-based prediction of protein interactions[J]. bioRxiv, 2018: 275529.
[12]	PEROZZI B, AL-RFOU R, SKIENA S. DeepWalk: online learning of social representations[C]// Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, Aug 24-27, 2014. New York: ACM, 2014: 701-710.
[13]	TANG J, QU M, WANG M Z, et al. LINE: large-scale infor-mation network embedding[C]// Proceedings of the 24th Inter-national Conference on World Wide Web, Florence, May 18-22, 2015. New York: ACM, 2015: 1067-1077.
[14]	GROVER A, LESKOVEC J. node2vec:scalable feature lear-ning for networks[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, Aug 13-17, 2016. New York: ACM, 2016: 855-864.
[15]	RIBEIRO L F R, SAVARESE P H P, FIGUEIREDO D R. struc2vec:learning node representations from structural iden-tity[C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halfiax, Aug 13-17, 2017. New York: ACM, 2017: 385-394.
[16]	WANG X, CUI P, WANG J, et al. Community preserving network embedding[C]// Proceedings of the 31st AAAI Con-ference on Artificial Intelligence, San Francisco, Feb 4-9, 2017. Palo Alto: AAAI, 2017: 203-209.
[17]	LI Y, WANG Y, ZHANG T T, et al. Learning network em-bedding with community structural information[C]// Procee-dings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, Aug 10-16, 2019. San Francisco: Morgan Kaufmann, 2019: 2937-2943.
[18]	LAI Y A, HSU C C, CHEN W H, et al. PRUNE: preserving proximity and global ranking for network embedding[C]// Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, Dec 4-9, 2017. Cambridge: MIT Press, 2017: 5257-5266.
[19]	LICHTENWALTER R, LUSSIER J T, CHAWLA N V. New perspectives and methods in link prediction[C]// Proceedings of the 16th ACM SIGKDD International Conference on Know-ledge Discovery and Data Mining, Washington, Jul 25-28, 2010. New York: ACM, 2010: 243-252.
[20]	CUKIERSKI W, HAMNER B, YANG B. Graph-based fea-tures for supervised link prediction[C]// Proceedings of the 2011 International Joint Conference on Neural Networks, San Jose, Jul 31-Aug 5, 2011. Piscataway: IEEE, 2011: 1237-1244.
[21]	ZHANG M H, CHEN Y X. Weisfeiler-Lehman neural ma-chine for link prediction[C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halfiax, Aug 13-17, 2017. New York: ACM, 2017: 575-583.
[22]	ZHANG M H, CHEN Y X. Link prediction based on graph neural networks[C]// Proceedings of the 32nd Conference on Neural Information Processing Systems, Montreal, Dec 2-8, 2018. Cambridge: MIT Press, 2018: 5171-5181.
[23]	ZHANG M H, CUI Z C, NEUMANN M, et al. An end-to-end deep learning architecture for graph classification[C]// Proceedings of the 32nd AAAI Conference on Artificial Intel-ligence, New Orleans, Feb 2-7, 2018. Palo Alto: AAAI, 2018: 4438-4445.
[24]	VELICKOVIC P, CUCURULL G, CASANOVA A, et al. Graph attention networks[C]// Proceedings of the 6th Inter-national Conference on Learning Representations, Vancouver, Apr 30-May 3, 2018: 1-12.
[25]	XU G L, WANG X K, WANG Y, et al. Edge-nodes represe-ntation neural machine for link prediction[J]. Algorithms, 2019, 12(1): 12. DOI URL

融合图注意力的多特征链接预测算法

Multi-feature Based Link Prediction Algorithm Fusing Graph Attention

RichHTML

PDF

可视化

摘要/Abstract

引用本文

使用本文

图/表 10

参考文献 25

相关文章 15

编辑推荐

Metrics

[1]	杨知桥, 张莹, 王新杰, 张东波, 王玉. 改进U型网络在视网膜病变检测中的应用研究[J]. 计算机科学与探索, 2022, 16(8): 1877-1884.
[2]	于慧琳, 陈炜, 王琪, 高建伟, 万怀宇. 使用子图推理实现知识图谱关系预测[J]. 计算机科学与探索, 2022, 16(8): 1800-1808.
[3]	夏鸿斌, 肖奕飞, 刘渊. 融合自注意力机制的长文本生成对抗网络模型[J]. 计算机科学与探索, 2022, 16(7): 1603-1610.
[4]	李运寰, 闻继伟, 彭力. 高帧率的轻量级孪生网络目标跟踪[J]. 计算机科学与探索, 2022, 16(6): 1405-1416.
[5]	陆仲达, 张春达, 张佳奇, 王子菲, 许军华. 双分支网络的苹果叶部病害识别[J]. 计算机科学与探索, 2022, 16(4): 917-926.
[6]	黄思远, 赵宇海, 梁燚铭. 融合图嵌入和注意力机制的代码搜索[J]. 计算机科学与探索, 2022, 16(4): 844-854.
[7]	赵鹏飞, 谢林柏, 彭力. 融合注意力机制的深层次小目标检测算法[J]. 计算机科学与探索, 2022, 16(4): 927-937.
[8]	王燕妮, 余丽仙. 注意力与多尺度有效融合的SSD目标检测算法[J]. 计算机科学与探索, 2022, 16(2): 438-447.
[9]	蒋光峰, 胡鹏程, 叶桦, 仰燕兰. 基于重构误差的同构图分类模型[J]. 计算机科学与探索, 2022, 16(1): 185-193.
[10]	袁立宁, 李欣, 王晓冬, 刘钊. 图嵌入模型综述[J]. 计算机科学与探索, 2022, 16(1): 59-87.
[11]	吴正洋, 汤庸, 刘海. 个性化学习推荐研究综述[J]. 计算机科学与探索, 2022, 16(1): 21-40.
[12]	蔡明昕, 孙晶, 王斌. 多角度语义轨迹相似度计算模型[J]. 计算机科学与探索, 2021, 15(9): 1632-1640.
[13]	陈俊芬, 张明, 赵佳成, 谢博鋆, 李艳. 结合降噪和自注意力的深度聚类算法[J]. 计算机科学与探索, 2021, 15(9): 1717-1727.
[14]	张晗, 贾甜远, 骆方, 张生, 邬霞. 面向网络文本的BERT心理特质预测研究[J]. 计算机科学与探索, 2021, 15(8): 1459-1468.
[15]	陈剑南, 杜军平, 薛哲, 寇菲菲. 基于多重注意力的金融事件大数据精准画像[J]. 计算机科学与探索, 2021, 15(7): 1237-1244.