计算机科学与探索 ›› 2015, Vol. 9 ›› Issue (3): 279-291.DOI: 10.3778/j.issn.1673-9418.1411046

• 学术研究 • 上一篇    下一篇

基于低秩和局部约束矩阵估计的链接预测方法

刘  冶1,印  鉴1+,邓泽亚1,王智圣1,潘  炎2   

  1. 1. 中山大学 信息科学与技术学院,广州 510006
    2. 中山大学 软件学院,广州 510006
  • 出版日期:2015-03-01 发布日期:2015-03-09

Link Prediction with Low-Rank and Local Constraint Matrix Estimation

LIU Ye1, YIN Jian1+, DENG Zeya1, WANG Zhisheng1, PAN Yan2   

  1. 1. School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510006, China
    2. School of Software, Sun Yat-sen University, Guangzhou 510006, China
  • Online:2015-03-01 Published:2015-03-09

摘要: 在大数据时代,互联网社会网络和其他复杂网络中的链接预测问题研究成为热门领域。链接预测相关的方法已被广泛地应用于社会网络关系挖掘、个性化推荐和生物制药等领域。在链接预测问题中,通常使用相似性矩阵来表示网络中任意节点之间存在链接的可能性,因此相似性矩阵的计算是链接预测中至关重要的一步。近年来的研究中,大多数方法是基于已知网络中数据的分析,通过网络潜在结构设计机器学习算法构造相似性矩阵。在全局低秩的网络结构假设下,结合网络中节点特征的局部约束,提出了一种基于数据的链接预测优化算法,并针对复杂网络数据链接预测问题设计了可扩展的分治方法,便于分布式环境中对大规模数据进行求解。通过在多个真实数据集上的实验和结果分析,基于低秩结构和局部约束矩阵估计的链接预测分治方法能够取得较好的效果,并对复杂的网络结构数据具有较强的可扩展性。

关键词: 链接预测, 低秩估计, 局部约束, 社会网络, 数据挖掘

Abstract: In the big data era, the research on link prediction for social network on the Web and other complex networks has attracted popular interest. Some methods with link prediction have been widely used in social network relationship mining, individual recommendation and biological pharmacy. In the link prediction problem, the similarity matrix is selected for representing the probability of existed links between any nodes. Therefore, the calculation method for estimating the similarity matrix becomes the most crucial step. Recent years, most research focuses on the methods based on data analysis which construct the similarity relationship matrix by data-dependent machine learning and optimization algorithm. This paper proposes a novel data-dependent link prediction method with global low-rank structure assumption and local constraint of node features in the network. The new proposed method is designed for scalable divide-and-conquer calculation for complex network and suitable for distributed computation. Extensive experiments on several real-world datasets show that the proposed link prediction measure obtains competitive performance compared with the baselines. The results also indicate the new algorithm is effective, robust and scalable for complex networks.

Key words: link prediction, low-rank estimation, local constraint, social network, data mining