融合节点属性和无环路径的社交网络嵌入方法

doi:10.3778/j.issn.1673-9418.2104075

计算机科学与探索 ›› 2022, Vol. 16 ›› Issue (11): 2505-2518.DOI: 10.3778/j.issn.1673-9418.2104075

融合节点属性和无环路径的社交网络嵌入方法

王本钰, 顾益军⁺(), 彭舒凡

中国人民公安大学信息网络安全学院，北京 100032

收稿日期:2021-04-02 修回日期:2021-05-17 出版日期:2022-11-01 发布日期:2021-05-25
通讯作者: + E-mail: guyijun@ppsuc.edu.cn
作者简介:王本钰（1998—），男，江苏盐城人，硕士研究生，主要研究方向为复杂网络、深度学习。
顾益军（1968—），男，江苏无锡人，博士，教授，博士生导师，主要研究方向为复杂网络、深度学习。
彭舒凡（1998—），男，江苏无锡人，硕士研究生，主要研究方向为复杂网络、深度学习。
基金资助:
公安部科技强警基础工作专项项目(2020GABJC02);中国人民公安大学基本科研业务费项目(2021JKF420)

Social Network Embedding Method Combining Node Attributes and Loop-Free Path

WANG Benyu, GU Yijun⁺(), PENG Shufan

School of Information and Network Security, People’s Public Security University of China, Beijing 100032, China

Received:2021-04-02 Revised:2021-05-17 Online:2022-11-01 Published:2021-05-25
About author:WANG Benyu, born in 1998, M.S. candidate. His research interests include complex network and deep learning.
GU Yijun, born in 1968, Ph.D., professor, Ph.D. supervisor. His research interests include complex network and deep learning.
PENG Shufan, born in 1998, M.S. candidate. His research interests include complex network and deep learning.
Supported by:
Police Project of Science and Technology to Strengthen the Basic Work of Ministry of Public Security of China(2020GABJC02);Basic Research Funds of People’s Public Security University of China(2021JKF420)

摘要/Abstract

摘要：

网络嵌入的目标是学习网络中节点的低维特征表示，将学习到的特征用于网络的各种分析任务中，例如节点分类、链路预测、社区发现和推荐等。现有的网络嵌入方法对于社交网络中高阶结构信息利用不足并且没有考虑社交网络结构信息和属性信息的相关性，应用于社交网络中效果并不理想。为了解决这些问题，提出了一种融合节点属性和无环路径的社交网络嵌入方法（LFNE）。该算法首先基于节点间无环路径计算节点高阶结构相似性，消除环状路径和大度节点对于节点结构相似性的影响，使得网络嵌入方法可以更好地融合社交网络高阶结构信息。然后结合节点间无环路径相似性度量指标计算节点属性相似性，充分利用社交网络结构信息和属性信息的相关性，消除属性信息中存在的噪音。最后融合节点结构相似性和属性相似性应用于堆叠降噪自动编码器中学习节点的低维特征表示。在3个社交网络数据集上与近几年代表性算法进行实验对比，实验结果表明LFNE算法在节点分类和链路预测实验中可以取得相对显著的效果，具有更好的网络嵌入表现。

关键词: 网络嵌入, 社交网络, 节点属性, 无环路径, 堆叠降噪自动编码器

Abstract:

Network embedding’s goal is to learn the low-dimensional node feature representation in the network. The learned features are used in various network analysis tasks, such as node classification, link prediction, community detection and recommendation, etc. The existing network embedding methods do not make full use of high-order structure information in social networks. Moreover, the correlation between structure information and node attribute information is not considered. The effect of these methods applied in the social network is not ideal. A social network embedding method combining loop-free path and attributes network embedding (LFNE) is proposed to solve these problems. The high-order structural similarity of nodes is calculated first based on the loop-free path between nodes to eliminate the influence of loop path and large-degree nodes on node structure similarity. This algorithm makes the network embedding method better integrate the high-order social network structure information. Then the node attributes similarity is calculated by combining the loop-free path similarity measurement index between nodes, and the correlation between social network structure information and attribute information is fully utilized to eliminate the noise in attribute information. Finally, the node structure similarity and attribute similarity are fused and applied to learning the low-dimensional feature representation of nodes in the stacked denoising autoencoder. Comparison of experiments with representative algorithms in recent years on three social network datasets shows that the LFNE algorithm can achieve relatively significant results in node classification and link prediction with better network embedding performance.

Key words: network embedding, social network, node attributes, loop-free path, stacked denoising autoencoder

中图分类号:

TP391

王本钰, 顾益军, 彭舒凡. 融合节点属性和无环路径的社交网络嵌入方法[J]. 计算机科学与探索, 2022, 16(11): 2505-2518.

WANG Benyu, GU Yijun, PENG Shufan. Social Network Embedding Method Combining Node Attributes and Loop-Free Path[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(11): 2505-2518.

图/表 12

参考文献 30

[1]	TU C, YANG C, LIU Z, et al. Network representation learning: an overview[J]. Scientia Sinica Informationis, 2017, 47(8): 980-996. DOI URL
[2]	PEROZZI B, AL-RFOU R, SKIENA S. DeepWalk: online learning of social representations[C]// Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, Aug 24-27, 2014. New York: ACM, 2014: 701-710.
[3]	MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[J]. arXiv:1310.4546, 2013.
[4]	TANG J, QU M, WANG M, et al. Line: large-scale information network embedding[C]// Proceedings of the 24th International Conference on World Wide Web, Florence, May 18-22, 2015. New York: ACM, 2015: 1067-1077.
[5]	CAO S, LU W, XU Q. GraRep: learning graph representations with global structural information[C]// Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Oct 19-23, 2015. New York: ACM, 2015: 891-900.
[6]	GROVER A, LESKOVEC J. Node2vec: scalable feature lear-ning for networks[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, Aug 24-27, 2016. New York: ACM, 2016: 855-864.
[7]	CAO S, LU W, XU Q. Deep neural networks for learning graph representations[C]// Proceedings of the 2016 AAAI Conference on Artificial Intelligence, Phoenix, Feb 12-17, 2016. Menlo Park: AAAI, 2016: 1145-1152.
[8]	WANG D, CUI P, ZHU W. Structural deep network embedding[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, Aug 13-17, 2016. New York: ACM, 2016: 1225-1234.
[9]	RIBEIRO L F R, SAVERESE P H P, FIGUEIREDO D R. Struc2vec: learning node representations from structural identity[C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Aug 13-17, 2017. New York: ACM, 2017: 385-394.
[10]	WANG C, WANG C, WANG Z, et al. Edge2vec: edge-based social network embedding[J]. ACM Transactions on Knowledge Discovery from Data, 2020, 14(4): 1-24.
[11]	YANG C, LIU Z, ZHAO D, et al. Network representation learning with rich text information[C]// Proceedings of the 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Jul 25-31, 2015. Menlo Park: AAAI, 2015: 2111-2117.
[12]	TU C, LIU H, LIU Z, et al. CANE: context-aware network embedding for relation modeling[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Jul 30-Aug 4, 2017. Stroudsburg: ACL, 2017: 1722-1731.
[13]	HUANG X, LI J, HU X. Accelerated attributed network embedding[C]// Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, Apr 27-29, 2017. Philadelphia: SIAM, 2017: 633-641.
[14]	BANDYOPADHYAY S, LOKESH N, MURTY M N. Outlier aware network embedding for attributed networks[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, Jan 27-Feb 1, 2019. Menlo Park: AAAI, 2019: 12-19.
[15]	HONG R C, HE Y, WU L, et al. Deep attributed network embedding by preserving structure and attribute information[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2021, 51(3): 1434-1445. DOI URL
[16]	NOZZA D, FERSINI E, MESSINA E. CAGE: constrained deep attributed graph embedding[J]. Information Sciences, 2020, 518: 56-70. DOI URL
[17]	NEWMAN M E J. Clustering and preferential attachment in growing networks[J]. Physical Review E, 2001, 64(2): 025102.
[18]	CHOWDHURY G G. Introduction to modern information retrieval[M]. London: Facet Publishing, 2010.
[19]	RAVASZ E, SOMERA A L, MONGRU D A, et al. Hierarchical organization of modularity in metabolic networks[J]. Science, 2002, 297(5586): 1551-1555. DOI URL
[20]	LEICHT E A, HOLME P, NEWMAN M E J. Vertex similarity in networks[J]. Physical Review E, 2006, 73(2): 026120.
[21]	KATZ L. A new status index derived from sociometric analysis[J]. Psychometrika, 1953, 18(1): 39-43. DOI URL
[22]	ZHOU T, LÜ L, ZHANG Y C. Predicting missing links via local information[J]. The European Physical Journal B, 2009, 71(4): 623-630. DOI URL
[23]	CHEN Z, XIE Z, ZHANG Q. Community detection based on local topological information and its application in power grid[J]. Neurocomputing, 2015, 170: 384-392. DOI URL
[24]	WU J, CHENG N, ZHOU C, et al. Computing the number of loop-free k-hop paths of networks[J]. IEEE Transactions on Services Computing, 2022, 15(4): 2114-2128. DOI URL
[25]	LV L Y, JIN C H, ZHOU T. Similarity index based on local paths for link prediction of complex networks[J]. Physical Review E, 2009, 80(4): 046122.
[26]	唐晋韬, 王挺, 王戟. 适合复杂网络分析的最短路径近似算法[J]. 软件学报, 2011, 22(10): 2279-2290.
	TANG J T, WANG T, WANG J. Shortest path approximate algorithm for complex network analysis[J]. Journal of Software, 2011, 22(10): 2279-2290. DOI URL
[27]	LE GALL F. Powers of tensors and fast matrix multiplication[C]// Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, Kobe, Jul 23-25, 2014. New York: ACM, 2014: 296-303.
[28]	HUANG X, LI J, HU X. Label informed attributed network embedding[C]// Proceedings of the 10th ACM International Conference on Web Search and Data Mining, Cambridge, Feb 6-10, 2017. New York: ACM, 2017: 731-739.
[29]	GAO H, HUANG H. Self-paced network embedding[C]// Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, Aug 19-23, 2018. New York: ACM, 2018: 1406-1415.
[30]	FAWCETT T. An introduction to ROC analysis[J]. Pattern Recognition Letters, 2006, 27(8): 861-874. DOI URL

Datasets	Nodes	Edges	Attributes	Label
BlogCatalog	5 196	171 743	8 189	6
Flickr	7 575	239 738	12 047	9
Email	1 005	25 751	42	—

Datasets	Nodes	Edges	Attributes	Label
BlogCatalog	5 196	171 743	8 189	6
Flickr	7 575	239 738	12 047	9
Email	1 005	25 751	42	—

Datasets	Structure of autoencoder
BlogCatalog	5 196-1 024-256
Flickr	7 375-2 048-256
Email-Eu-core	1 005-512-64

Datasets	Structure of autoencoder
BlogCatalog	5 196-1 024-256
Flickr	7 375-2 048-256
Email-Eu-core	1 005-512-64

Model	Ratio of training sample
Model	10%	30%	50%	70%	90%
DeepWalk	0.583 5	0.653 1	0.674 3	0.692 4	0.683 4
node2vec	0.597 1	0.642 3	0.670 1	0.681 7	0.676 6
DNGR	0.666 3	0.674 5	0.702 5	0.707 7	0.712 3
SDNE	0.674 3	0.714 2	0.735 6	0.729 5	0.738 3
TADW	0.796 3	0.829 1	0.832 1	0.821 4	0.865 3
DANE	0.785 2	0.833 4	0.848 1	0.836 6	0.873 3
LFNE	0.836 4	0.878 8	0.887 5	0.869 3	0.873 1

融合节点属性和无环路径的社交网络嵌入方法

Social Network Embedding Method Combining Node Attributes and Loop-Free Path

RichHTML

PDF

可视化

摘要/Abstract

引用本文

使用本文

图/表 12

参考文献 30

相关文章 15

编辑推荐

Metrics

Model	Ratio of training sample
Model	10%	30%	50%	70%	90%
DeepWalk	0.576 5	0.651 3	0.671 1	0.684 8	0.675 4
node2vec	0.587 3	0.637 4	0.664 2	0.685 6	0.680 1
DNGR	0.669 6	0.671 2	0.693 4	0.701 2	0.718 8
SDNE	0.664 5	0.707 6	0.729 6	0.721 5	0.734 5
TADW	0.799 6	0.814 1	0.823 1	0.810 6	0.857 4
DANE	0.778 7	0.826 9	0.831 4	0.831 9	0.856 5
LFNE	0.829 6	0.870 1	0.883 1	0.860 1	0.867 3

Model	Ratio of training sample
Model	10%	30%	50%	70%	90%
DeepWalk	0.433 1	0.502 1	0.513 3	0.519 6	0.524 3
node2vec	0.418 5	0.479 5	0.510 7	0.518 4	0.525 5
DNGR	0.476 3	0.521 2	0.571 4	0.596 3	0.612 2
SDNE	0.501 4	0.541 7	0.568 4	0.594 7	0.603 3
TADW	0.645 3	0.694 5	0.701 6	0.721 3	0.731 1
DANE	0.654 8	0.706 3	0.719 4	0.734 5	0.741 4
LFNE	0.713 6	0.726 7	0.736 6	0.752 1	0.770 5

Model	Ratio of training sample
Model	10%	30%	50%	70%	90%
DeepWalk	0.431 4	0.497 4	0.508 5	0.516 6	0.521 4
node2vec	0.414 7	0.473 4	0.506 6	0.512 1	0.521 7
DNGR	0.471 1	0.516 7	0.564 1	0.587 4	0.604 5
SDNE	0.497 1	0.536 4	0.561 4	0.587 6	0.594 1
TADW	0.638 4	0.681 4	0.707 7	0.711 5	0.724 5
DANE	0.646 3	0.692 1	0.714 5	0.734 7	0.748 4
LFNE	0.708 5	0.713 4	0.733 1	0.752 4	0.766 3

[1]	王雪纯, 吕晟凯, 吴浩, 何鹏, 曾诚. 多网络混合嵌入学习的服务推荐方法研究[J]. 计算机科学与探索, 2022, 16(7): 1529-1542.
[2]	陈江美, 张文德. 基于位置社交网络的兴趣点推荐系统研究综述[J]. 计算机科学与探索, 2022, 16(7): 1462-1478.
[3]	杨书新, 宋建缤, 梁文. 基于割点的社交网络影响最大化问题[J]. 计算机科学与探索, 2022, 16(6): 1316-1326.
[4]	陈洁, 陈嘉琳, 赵姝, 张燕平. 层次标签引导的属性网络嵌入[J]. 计算机科学与探索, 2021, 15(7): 1279-1288.
[5]	端祥宇, 袁冠, 孟凡荣. 动态社区发现方法研究综述[J]. 计算机科学与探索, 2021, 15(4): 612-630.
[6]	高昂, 梁英, 谢小杰, 王梓森, 李锦涛. 支持隐私保护的社交网络信息传播方法[J]. 计算机科学与探索, 2021, 15(2): 233-248.
[7]	郭奕，徐亮，熊雪军. 社交网络中意见领袖挖掘方法综述[J]. 计算机科学与探索, 2021, 15(11): 2077-2092.
[8]	李娜，朱怀杰，刘威，印鉴. 基于位置的稀疏群体查询[J]. 计算机科学与探索, 2021, 15(11): 2151-2160.
[9]	李想，申德荣，冯朔，寇月，聂铁铮. 结合全局种子最优局部扩展的跨网络用户识别[J]. 计算机科学与探索, 2020, 14(6): 928-938.
[10]	张卜聆，王兴伟，李婕，易波，黄敏. 基于朋友圈和节点感知的内容中心MSN路由机制[J]. 计算机科学与探索, 2020, 14(1): 51-58.
[11]	王瑶，寇月，申德荣，聂铁铮，于戈. 元路径选择和矩阵分解的跨社交网络链路预测[J]. 计算机科学与探索, 2019, 13(9): 1459-1470.
[12]	张昱，高克宁，陈默，于戈. 融合网络结构和节点属性的链接预测方法[J]. 计算机科学与探索, 2019, 13(7): 1094-1101.
[13]	江淼淼，孙更新，宾晟. 多关系社交网络中社团结构发现算法[J]. 计算机科学与探索, 2019, 13(7): 1134-1144.
[14]	寇晓宇，吕天舒，张岩. 结合深度学习的网络邻居结构研究及应用[J]. 计算机科学与探索, 2019, 13(2): 239-250.
[15]	杨帅，胡学钢，张玉红. 用于域适应的多边缘降噪自动编码器[J]. 计算机科学与探索, 2019, 13(2): 322-329.