计算机科学与探索 (Journal of Frontiers of Computer Science and Technology) ›› 2019, Vol. 13 ›› Issue (8): 1261-1271. DOI: 10.3778/j.issn.1673-9418.1807041

• Academic Research •

保持Motif结构的网络表示学习

许磊,黄玲,王昌栋   

  1. 中山大学 数据科学与计算机学院,广州 510000
  • 出版日期:2019-08-01 发布日期:2019-08-07

Motif-Preserving Network Representation Learning

XU Lei, HUANG Ling, WANG Changdong   

  1. School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
  • Online: 2019-08-01 Published: 2019-08-07

摘要: 随着信息技术的广泛应用,网络在人们日常的生活中变得无处不在。网络表示学习算法是最近研究网络的一个热门领域,它旨在保留网络拓扑结构信息的同时,将网络映射到一个潜在、低维度的向量空间。网络Motif,在网络分析中具有重要的意义,然而之前提出的网络表示学习算法绝大多数只考虑了节点的邻域属性或邻近性,而忽略了节点的Motif结构信息。因此,基于上述考虑,提出了算法“保持Motif结构的网络表示学习”,使得在学习网络节点向量表示时能够更加侧重地考虑网络Motif的结构。算法首先计算出基于Motif的网络权重矩阵;接着求得网络中每个节点的基于Motif的个性化PageRank预估值;最后进行MotifWalk得到游走路径,从而能够运用Word2Vec模型来得到网络的向量表示。通过与三个经典的网络表示算法比较,发现在稠密以及Motif结构丰富的网络中,提出的算法表现得更好。

关键词: 网络表示学习, 随机游走, Motif结构

Abstract: With the widespread use of information technologies, networks have become ubiquitous in real-world applications. Network representation learning (also called network embedding), which has attracted increasing attention in recent years, aims to learn latent, low-dimensional representations of network vertices while preserving the network's topological structure. Network Motifs are of key importance in network analysis, yet most earlier network representation learning methods consider only a node's neighborhood attributes or proximity and ignore the Motif structure around it. This paper therefore proposes the Motif-preserving network embedding (MPNE) algorithm, which places more emphasis on Motif structure when learning vertex representations. The algorithm first constructs a weighted network from the given Motif structure, then computes the Motif-based approximate personalized PageRank (APPR) vector for each node of the weighted network, and finally performs MotifWalk on the graph to generate walk paths, to which the Word2Vec model is applied to obtain the vector representation of the network. In comparisons with three classical network representation algorithms, the proposed algorithm performs better on dense and Motif-rich networks.

Key words: network representation learning, random walk, Motif
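
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' MPNE implementation, and it rests on several assumptions: the Motif is taken to be a triangle, so each edge is weighted by the number of triangles that contain it; exact power iteration stands in for the approximate personalized PageRank (APPR) step; and the walk simply follows the Motif weights, because the abstract does not say how MotifWalk combines the PageRank scores with the walk. The function names `triangle_motif_weights`, `personalized_pagerank`, and `motif_walk`, the parameters `alpha` and `iters`, and the toy graph are all hypothetical.

```python
import random


def triangle_motif_weights(edges):
    """Weight each undirected edge by the number of triangles containing it."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    weights = {u: {} for u in adj}
    for u in adj:
        for v in adj[u]:
            shared = len(adj[u] & adj[v])  # each common neighbour closes a triangle
            if shared:
                weights[u][v] = shared  # edges in no triangle get no weight
    return weights


def personalized_pagerank(weights, seed, alpha=0.15, iters=50):
    """Power iteration for PageRank personalized to `seed` (stand-in for APPR)."""
    rank = {n: 0.0 for n in weights}
    rank[seed] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in weights}
        for u, nbrs in weights.items():
            total = sum(nbrs.values())
            if total == 0:
                # Node with no Motif edges: return its mass to the seed.
                nxt[seed] += (1 - alpha) * rank[u]
                continue
            for v, w in nbrs.items():
                nxt[v] += (1 - alpha) * rank[u] * w / total
        nxt[seed] += alpha  # teleport back to the seed
        rank = nxt
    return rank


def motif_walk(weights, start, length, rng):
    """Random walk whose transition probabilities follow the Motif weights."""
    walk = [start]
    while len(walk) < length:
        nbrs = weights[walk[-1]]
        if not nbrs:
            break  # no Motif edge to follow from this node
        nodes, w = zip(*nbrs.items())
        walk.append(rng.choices(nodes, weights=w, k=1)[0])
    return walk


# Toy graph: three triangles (0-1-2, 0-2-3, 2-3-4) plus a pendant edge 4-5.
edges = [(0, 1), (1, 2), (0, 2), (0, 3), (2, 3), (3, 4), (2, 4), (4, 5)]
weights = triangle_motif_weights(edges)
ppr = personalized_pagerank(weights, seed=0)
rng = random.Random(0)
walks = [motif_walk(weights, 0, 10, rng) for _ in range(3)]
```

In this toy graph the pendant edge 4-5 lies in no triangle, so it receives no weight and node 5 is never reached by a walk or by the PageRank mass. The resulting `walks` play the role of sentences; feeding them to a skip-gram Word2Vec implementation would yield the node vectors, a step that is not shown here.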