Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (10): 1733-1744. DOI: 10.3778/j.issn.1673-9418.1903052

• Artificial Intelligence and Pattern Recognition •

Network Representation Learning via Variational Auto-Encoder

ZHANG Lei (张蕾), QIAN Feng (钱峰), ZHAO Shu (赵姝), CHEN Jie (陈洁), ZHANG Yanping (张燕平)

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, China
  2. School of Mathematics and Computer Science, Tongling University, Tongling, Anhui 244061, China
  • Online: 2019-10-01 Published: 2019-10-15

Abstract: Network representation learning aims to map network nodes into a low-dimensional vector space so that downstream tasks such as node classification, link prediction, community detection, and recommendation can be solved with existing machine learning methods. Network nodes usually carry attribute information that is correlated to some extent with the structural information, and incorporating this information into the representation learning process helps improve the performance of downstream tasks. However, across different application scenarios, structural and attribute information are not always linearly correlated, and both are highly nonlinear data. This paper proposes VANRL, a network representation learning method based on the variational auto-encoder. A variational auto-encoder is a deep neural network that can not only capture the nonlinear similarity of structure and attributes but also learn the distribution of the data. In addition, structural and attribute information are combined flexibly for different application tasks so that the learned node representations achieve satisfactory performance. Experimental results on four networks, two social networks and two citation networks, show that VANRL achieves relatively significant results on node classification and link prediction.

Key words: network representation learning, topology structure, node attribute, variational auto-encoder
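
The abstract names two ingredients: a flexible combination of structural and attribute information, and a variational auto-encoder that learns a distribution over node representations. The sketch below illustrates only the generic building blocks of such a model and is not the authors' implementation. The weighted concatenation in `fuse`, the weight `alpha`, and all function names are assumptions made for illustration; the reparameterization step and the KL term are the standard ones for a variational auto-encoder with a diagonal Gaussian posterior and a standard normal prior.

```python
import math
import random


def fuse(structure_row, attribute_row, alpha=0.5):
    """Hypothetical fusion: weighted concatenation of one node's adjacency
    row and its attribute vector. alpha trades structure against attributes;
    the paper's actual fusion scheme may differ."""
    return ([alpha * s for s in structure_row]
            + [(1.0 - alpha) * a for a in attribute_row])


def reparameterize(mu, log_var, rng):
    """Standard VAE reparameterization: z = mu + sigma * eps, eps ~ N(0, 1),
    where sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), the regularization term
    of the standard VAE objective."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


if __name__ == "__main__":
    # Toy node: connected to nodes 0 and 2, with a 2-dimensional attribute vector.
    x = fuse([1.0, 0.0, 1.0], [0.2, 0.8], alpha=0.7)
    # In a real model an encoder network would produce mu and log_var from x;
    # here they are fixed placeholder values.
    mu, log_var = [0.3, -0.1], [-1.0, -2.0]
    z = reparameterize(mu, log_var, random.Random(0))
    print("fused input:", x)
    print("sampled embedding:", z)
    print("KL term:", kl_to_standard_normal(mu, log_var))
```

In a full model, the sampled vector `z` would serve as the node representation and a decoder would reconstruct the fused input from it, with the reconstruction error added to the KL term during training.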