计算机科学与探索 ›› 2022, Vol. 16 ›› Issue (11): 2505-2518.DOI: 10.3778/j.issn.1673-9418.2104075
收稿日期:
2021-04-02
修回日期:
2021-05-17
出版日期:
2022-11-01
发布日期:
2021-05-25
通讯作者:
+ E-mail: guyijun@ppsuc.edu.cn作者简介:
王本钰(1998—),男,江苏盐城人,硕士研究生,主要研究方向为复杂网络、深度学习。基金资助:
WANG Benyu, GU Yijun+(), PENG Shufan
Received:
2021-04-02
Revised:
2021-05-17
Online:
2022-11-01
Published:
2021-05-25
About author:
WANG Benyu, born in 1998, M.S. candidate. His research interests include complex network and deep learning.Supported by:
摘要:
网络嵌入的目标是学习网络中节点的低维特征表示,将学习到的特征用于网络的各种分析任务中,例如节点分类、链路预测、社区发现和推荐等。现有的网络嵌入方法对于社交网络中高阶结构信息利用不足并且没有考虑社交网络结构信息和属性信息的相关性,应用于社交网络中效果并不理想。为了解决这些问题,提出了一种融合节点属性和无环路径的社交网络嵌入方法(LFNE)。该算法首先基于节点间无环路径计算节点高阶结构相似性,消除环状路径和大度节点对于节点结构相似性的影响,使得网络嵌入方法可以更好地融合社交网络高阶结构信息。然后结合节点间无环路径相似性度量指标计算节点属性相似性,充分利用社交网络结构信息和属性信息的相关性,消除属性信息中存在的噪音。最后融合节点结构相似性和属性相似性应用于堆叠降噪自动编码器中学习节点的低维特征表示。在3个社交网络数据集上与近几年代表性算法进行实验对比,实验结果表明LFNE算法在节点分类和链路预测实验中可以取得相对显著的效果,具有更好的网络嵌入表现。
中图分类号:
王本钰, 顾益军, 彭舒凡. 融合节点属性和无环路径的社交网络嵌入方法[J]. 计算机科学与探索, 2022, 16(11): 2505-2518.
WANG Benyu, GU Yijun, PENG Shufan. Social Network Embedding Method Combining Node Attributes and Loop-Free Path[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(11): 2505-2518.
Datasets | Nodes | Edges | Attributes | Label |
---|---|---|---|---|
BlogCatalog | 5 196 | 171 743 | 8 189 | 6 |
Flickr | 7 575 | 239 738 | 12 047 | 9 |
1 005 | 25 751 | 42 | — |
表1 真实网络数据集基本信息
Table 1 Basic information of real network datasets
Datasets | Nodes | Edges | Attributes | Label |
---|---|---|---|---|
BlogCatalog | 5 196 | 171 743 | 8 189 | 6 |
Flickr | 7 575 | 239 738 | 12 047 | 9 |
1 005 | 25 751 | 42 | — |
Datasets | Structure of autoencoder |
---|---|
BlogCatalog | 5 196-1 024-256 |
Flickr | 7 375-2 048-256 |
Email-Eu-core | 1 005-512-64 |
表2 不同数据集下堆叠降噪自动编码器结构
Table 2 Structure of stacked denoising autoencoder under different datasets
Datasets | Structure of autoencoder |
---|---|
BlogCatalog | 5 196-1 024-256 |
Flickr | 7 375-2 048-256 |
Email-Eu-core | 1 005-512-64 |
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.583 5 | 0.653 1 | 0.674 3 | 0.692 4 | 0.683 4 |
node2vec | 0.597 1 | 0.642 3 | 0.670 1 | 0.681 7 | 0.676 6 |
DNGR | 0.666 3 | 0.674 5 | 0.702 5 | 0.707 7 | 0.712 3 |
SDNE | 0.674 3 | 0.714 2 | 0.735 6 | 0.729 5 | 0.738 3 |
TADW | 0.796 3 | 0.829 1 | 0.832 1 | 0.821 4 | 0.865 3 |
DANE | 0.785 2 | 0.833 4 | 0.848 1 | 0.836 6 | 0.873 3 |
LFNE | 0.836 4 | 0.878 8 | 0.887 5 | 0.869 3 | 0.873 1 |
表3 BlogCatalog数据集节点分类Micro-F1分数
Table 3 Micro-F1 of node classification on BlogCatalog dataset
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.583 5 | 0.653 1 | 0.674 3 | 0.692 4 | 0.683 4 |
node2vec | 0.597 1 | 0.642 3 | 0.670 1 | 0.681 7 | 0.676 6 |
DNGR | 0.666 3 | 0.674 5 | 0.702 5 | 0.707 7 | 0.712 3 |
SDNE | 0.674 3 | 0.714 2 | 0.735 6 | 0.729 5 | 0.738 3 |
TADW | 0.796 3 | 0.829 1 | 0.832 1 | 0.821 4 | 0.865 3 |
DANE | 0.785 2 | 0.833 4 | 0.848 1 | 0.836 6 | 0.873 3 |
LFNE | 0.836 4 | 0.878 8 | 0.887 5 | 0.869 3 | 0.873 1 |
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.576 5 | 0.651 3 | 0.671 1 | 0.684 8 | 0.675 4 |
node2vec | 0.587 3 | 0.637 4 | 0.664 2 | 0.685 6 | 0.680 1 |
DNGR | 0.669 6 | 0.671 2 | 0.693 4 | 0.701 2 | 0.718 8 |
SDNE | 0.664 5 | 0.707 6 | 0.729 6 | 0.721 5 | 0.734 5 |
TADW | 0.799 6 | 0.814 1 | 0.823 1 | 0.810 6 | 0.857 4 |
DANE | 0.778 7 | 0.826 9 | 0.831 4 | 0.831 9 | 0.856 5 |
LFNE | 0.829 6 | 0.870 1 | 0.883 1 | 0.860 1 | 0.867 3 |
表4 BlogCatalog数据集节点分类Macro-F1分数
Table 4 Macro-F1 of node classification on BlogCatalog dataset
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.576 5 | 0.651 3 | 0.671 1 | 0.684 8 | 0.675 4 |
node2vec | 0.587 3 | 0.637 4 | 0.664 2 | 0.685 6 | 0.680 1 |
DNGR | 0.669 6 | 0.671 2 | 0.693 4 | 0.701 2 | 0.718 8 |
SDNE | 0.664 5 | 0.707 6 | 0.729 6 | 0.721 5 | 0.734 5 |
TADW | 0.799 6 | 0.814 1 | 0.823 1 | 0.810 6 | 0.857 4 |
DANE | 0.778 7 | 0.826 9 | 0.831 4 | 0.831 9 | 0.856 5 |
LFNE | 0.829 6 | 0.870 1 | 0.883 1 | 0.860 1 | 0.867 3 |
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.433 1 | 0.502 1 | 0.513 3 | 0.519 6 | 0.524 3 |
node2vec | 0.418 5 | 0.479 5 | 0.510 7 | 0.518 4 | 0.525 5 |
DNGR | 0.476 3 | 0.521 2 | 0.571 4 | 0.596 3 | 0.612 2 |
SDNE | 0.501 4 | 0.541 7 | 0.568 4 | 0.594 7 | 0.603 3 |
TADW | 0.645 3 | 0.694 5 | 0.701 6 | 0.721 3 | 0.731 1 |
DANE | 0.654 8 | 0.706 3 | 0.719 4 | 0.734 5 | 0.741 4 |
LFNE | 0.713 6 | 0.726 7 | 0.736 6 | 0.752 1 | 0.770 5 |
表5 Flickr数据集节点分类Micro-F1分数
Table 5 Micro-F1 of node classification on Flickr dataset
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.433 1 | 0.502 1 | 0.513 3 | 0.519 6 | 0.524 3 |
node2vec | 0.418 5 | 0.479 5 | 0.510 7 | 0.518 4 | 0.525 5 |
DNGR | 0.476 3 | 0.521 2 | 0.571 4 | 0.596 3 | 0.612 2 |
SDNE | 0.501 4 | 0.541 7 | 0.568 4 | 0.594 7 | 0.603 3 |
TADW | 0.645 3 | 0.694 5 | 0.701 6 | 0.721 3 | 0.731 1 |
DANE | 0.654 8 | 0.706 3 | 0.719 4 | 0.734 5 | 0.741 4 |
LFNE | 0.713 6 | 0.726 7 | 0.736 6 | 0.752 1 | 0.770 5 |
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.431 4 | 0.497 4 | 0.508 5 | 0.516 6 | 0.521 4 |
node2vec | 0.414 7 | 0.473 4 | 0.506 6 | 0.512 1 | 0.521 7 |
DNGR | 0.471 1 | 0.516 7 | 0.564 1 | 0.587 4 | 0.604 5 |
SDNE | 0.497 1 | 0.536 4 | 0.561 4 | 0.587 6 | 0.594 1 |
TADW | 0.638 4 | 0.681 4 | 0.707 7 | 0.711 5 | 0.724 5 |
DANE | 0.646 3 | 0.692 1 | 0.714 5 | 0.734 7 | 0.748 4 |
LFNE | 0.708 5 | 0.713 4 | 0.733 1 | 0.752 4 | 0.766 3 |
表6 Flickr数据集节点分类Macro-F1分数
Table 6 Macro-F1 of node classification on Flickr dataset
Model | Ratio of training sample | ||||
---|---|---|---|---|---|
10% | 30% | 50% | 70% | 90% | |
DeepWalk | 0.431 4 | 0.497 4 | 0.508 5 | 0.516 6 | 0.521 4 |
node2vec | 0.414 7 | 0.473 4 | 0.506 6 | 0.512 1 | 0.521 7 |
DNGR | 0.471 1 | 0.516 7 | 0.564 1 | 0.587 4 | 0.604 5 |
SDNE | 0.497 1 | 0.536 4 | 0.561 4 | 0.587 6 | 0.594 1 |
TADW | 0.638 4 | 0.681 4 | 0.707 7 | 0.711 5 | 0.724 5 |
DANE | 0.646 3 | 0.692 1 | 0.714 5 | 0.734 7 | 0.748 4 |
LFNE | 0.708 5 | 0.713 4 | 0.733 1 | 0.752 4 | 0.766 3 |
[1] |
TU C, YANG C, LIU Z, et al. Network representation learning: an overview[J]. Scientia Sinica Informationis, 2017, 47(8): 980-996.
DOI URL |
[2] | PEROZZI B, AL-RFOU R, SKIENA S. DeepWalk: online learning of social representations[C]// Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, Aug 24-27, 2014. New York: ACM, 2014: 701-710. |
[3] | MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[J]. arXiv:1310.4546, 2013. |
[4] | TANG J, QU M, WANG M, et al. Line: large-scale information network embedding[C]// Proceedings of the 24th International Conference on World Wide Web, Florence, May 18-22, 2015. New York: ACM, 2015: 1067-1077. |
[5] | CAO S, LU W, XU Q. GraRep: learning graph representations with global structural information[C]// Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Oct 19-23, 2015. New York: ACM, 2015: 891-900. |
[6] | GROVER A, LESKOVEC J. Node2vec: scalable feature lear-ning for networks[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, Aug 24-27, 2016. New York: ACM, 2016: 855-864. |
[7] | CAO S, LU W, XU Q. Deep neural networks for learning graph representations[C]// Proceedings of the 2016 AAAI Conference on Artificial Intelligence, Phoenix, Feb 12-17, 2016. Menlo Park: AAAI, 2016: 1145-1152. |
[8] | WANG D, CUI P, ZHU W. Structural deep network embedding[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, Aug 13-17, 2016. New York: ACM, 2016: 1225-1234. |
[9] | RIBEIRO L F R, SAVERESE P H P, FIGUEIREDO D R. Struc2vec: learning node representations from structural identity[C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Aug 13-17, 2017. New York: ACM, 2017: 385-394. |
[10] | WANG C, WANG C, WANG Z, et al. Edge2vec: edge-based social network embedding[J]. ACM Transactions on Knowledge Discovery from Data, 2020, 14(4): 1-24. |
[11] | YANG C, LIU Z, ZHAO D, et al. Network representation learning with rich text information[C]// Proceedings of the 24th International Joint Conference on Artificial Intelligence, Buenos Aires, Jul 25-31, 2015. Menlo Park: AAAI, 2015: 2111-2117. |
[12] | TU C, LIU H, LIU Z, et al. CANE: context-aware network embedding for relation modeling[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Jul 30-Aug 4, 2017. Stroudsburg: ACL, 2017: 1722-1731. |
[13] | HUANG X, LI J, HU X. Accelerated attributed network embedding[C]// Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, Apr 27-29, 2017. Philadelphia: SIAM, 2017: 633-641. |
[14] | BANDYOPADHYAY S, LOKESH N, MURTY M N. Outlier aware network embedding for attributed networks[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, Jan 27-Feb 1, 2019. Menlo Park: AAAI, 2019: 12-19. |
[15] |
HONG R C, HE Y, WU L, et al. Deep attributed network embedding by preserving structure and attribute information[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2021, 51(3): 1434-1445.
DOI URL |
[16] |
NOZZA D, FERSINI E, MESSINA E. CAGE: constrained deep attributed graph embedding[J]. Information Sciences, 2020, 518: 56-70.
DOI URL |
[17] | NEWMAN M E J. Clustering and preferential attachment in growing networks[J]. Physical Review E, 2001, 64(2): 025102. |
[18] | CHOWDHURY G G. Introduction to modern information retrieval[M]. London: Facet Publishing, 2010. |
[19] |
RAVASZ E, SOMERA A L, MONGRU D A, et al. Hierarchical organization of modularity in metabolic networks[J]. Science, 2002, 297(5586): 1551-1555.
DOI URL |
[20] | LEICHT E A, HOLME P, NEWMAN M E J. Vertex similarity in networks[J]. Physical Review E, 2006, 73(2): 026120. |
[21] |
KATZ L. A new status index derived from sociometric analysis[J]. Psychometrika, 1953, 18(1): 39-43.
DOI URL |
[22] |
ZHOU T, LÜ L, ZHANG Y C. Predicting missing links via local information[J]. The European Physical Journal B, 2009, 71(4): 623-630.
DOI URL |
[23] |
CHEN Z, XIE Z, ZHANG Q. Community detection based on local topological information and its application in power grid[J]. Neurocomputing, 2015, 170: 384-392.
DOI URL |
[24] |
WU J, CHENG N, ZHOU C, et al. Computing the number of loop-free k-hop paths of networks[J]. IEEE Transactions on Services Computing, 2022, 15(4): 2114-2128.
DOI URL |
[25] | LV L Y, JIN C H, ZHOU T. Similarity index based on local paths for link prediction of complex networks[J]. Physical Review E, 2009, 80(4): 046122. |
[26] | 唐晋韬, 王挺, 王戟. 适合复杂网络分析的最短路径近似算法[J]. 软件学报, 2011, 22(10): 2279-2290. |
TANG J T, WANG T, WANG J. Shortest path approximate algorithm for complex network analysis[J]. Journal of Software, 2011, 22(10): 2279-2290.
DOI URL |
|
[27] | LE GALL F. Powers of tensors and fast matrix multiplication[C]// Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, Kobe, Jul 23-25, 2014. New York: ACM, 2014: 296-303. |
[28] | HUANG X, LI J, HU X. Label informed attributed network embedding[C]// Proceedings of the 10th ACM International Conference on Web Search and Data Mining, Cambridge, Feb 6-10, 2017. New York: ACM, 2017: 731-739. |
[29] | GAO H, HUANG H. Self-paced network embedding[C]// Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, Aug 19-23, 2018. New York: ACM, 2018: 1406-1415. |
[30] |
FAWCETT T. An introduction to ROC analysis[J]. Pattern Recognition Letters, 2006, 27(8): 861-874.
DOI URL |
[1] | 王雪纯, 吕晟凯, 吴浩, 何鹏, 曾诚. 多网络混合嵌入学习的服务推荐方法研究[J]. 计算机科学与探索, 2022, 16(7): 1529-1542. |
[2] | 陈江美, 张文德. 基于位置社交网络的兴趣点推荐系统研究综述[J]. 计算机科学与探索, 2022, 16(7): 1462-1478. |
[3] | 杨书新, 宋建缤, 梁文. 基于割点的社交网络影响最大化问题[J]. 计算机科学与探索, 2022, 16(6): 1316-1326. |
[4] | 陈洁, 陈嘉琳, 赵姝, 张燕平. 层次标签引导的属性网络嵌入[J]. 计算机科学与探索, 2021, 15(7): 1279-1288. |
[5] | 端祥宇, 袁冠, 孟凡荣. 动态社区发现方法研究综述[J]. 计算机科学与探索, 2021, 15(4): 612-630. |
[6] | 高昂, 梁英, 谢小杰, 王梓森, 李锦涛. 支持隐私保护的社交网络信息传播方法[J]. 计算机科学与探索, 2021, 15(2): 233-248. |
[7] | 郭奕,徐亮,熊雪军. 社交网络中意见领袖挖掘方法综述[J]. 计算机科学与探索, 2021, 15(11): 2077-2092. |
[8] | 李娜,朱怀杰,刘威,印鉴. 基于位置的稀疏群体查询[J]. 计算机科学与探索, 2021, 15(11): 2151-2160. |
[9] | 李想,申德荣,冯朔,寇月,聂铁铮. 结合全局种子最优局部扩展的跨网络用户识别[J]. 计算机科学与探索, 2020, 14(6): 928-938. |
[10] | 张卜聆,王兴伟,李婕,易波,黄敏. 基于朋友圈和节点感知的内容中心MSN路由机制[J]. 计算机科学与探索, 2020, 14(1): 51-58. |
[11] | 王瑶,寇月,申德荣,聂铁铮,于戈. 元路径选择和矩阵分解的跨社交网络链路预测[J]. 计算机科学与探索, 2019, 13(9): 1459-1470. |
[12] | 张昱,高克宁,陈默,于戈. 融合网络结构和节点属性的链接预测方法[J]. 计算机科学与探索, 2019, 13(7): 1094-1101. |
[13] | 江淼淼,孙更新,宾晟. 多关系社交网络中社团结构发现算法[J]. 计算机科学与探索, 2019, 13(7): 1134-1144. |
[14] | 寇晓宇,吕天舒,张岩. 结合深度学习的网络邻居结构研究及应用[J]. 计算机科学与探索, 2019, 13(2): 239-250. |
[15] | 杨帅,胡学钢,张玉红. 用于域适应的多边缘降噪自动编码器[J]. 计算机科学与探索, 2019, 13(2): 322-329. |
阅读次数 | ||||||
全文 |
|
|||||
摘要 |
|
|||||