Journal of Frontiers of Computer Science and Technology ›› 2013, Vol. 7 ›› Issue (8): 718-728. DOI: 10.3778/j.issn.1673-9418.1305046

• Academic Research •

ACT-LDA: A Probabilistic Model of Topic, Community and User Influence

WU Liang1,2, HUANG Weijing1,2, CHEN Wei1,2,3+, WANG Tengjiao1,2,3, LEI Kai3, LIU Yueqin4   

  1. Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871, China
  2. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
  3. The Shenzhen Key Lab for Cloud Computing Technology and Applications, Shenzhen Graduate School, Peking University, Shenzhen, Guangdong 518055, China
  4. Department of Information Science and Technology, University of International Relations, Beijing 100091, China
  • Online: 2013-08-01 Published: 2013-08-06

Abstract: As social networks grow, their users form a large-scale user graph, and the content users publish, together with the links between items, forms a large-scale document graph. Mining the communities that users form, the influence of users within those communities, and the topics of each community from the user graph and the document graph is an important problem, yet existing work treats these tasks largely independently. Using both text content and coauthor relationships, and exploiting the interdependence of topic propagation, community formation and user influence, this paper proposes ACT-LDA (author-community-topic LDA), a unified model based on LDA (latent Dirichlet allocation) that integrates topic discovery, community discovery and user influence analysis. Inference is performed with a variational method. Experiments on DBLP data give very good results and support the effectiveness of the proposed model.

Key words: social network, community discovery, topic modeling, user influence
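
As a rough illustration of the two inputs named in the abstract, the sketch below builds a user graph from coauthorship and a document graph from links between papers. It is not the authors' code: the toy records, the field names `authors` and `cites`, the helper names, and the choice to treat citations as the links between articles are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records (not from the paper): paper id -> authors and cited papers.
PAPERS = {
    "p1": {"authors": ["alice", "bob"], "cites": []},
    "p2": {"authors": ["bob", "carol"], "cites": ["p1"]},
    "p3": {"authors": ["alice", "bob", "dave"], "cites": ["p1", "p2"]},
}


def build_user_graph(papers):
    """Undirected coauthor graph: (author_a, author_b) -> number of joint papers."""
    weights = defaultdict(int)
    for record in papers.values():
        # Sort so that each unordered pair always maps to the same key.
        for a, b in combinations(sorted(set(record["authors"])), 2):
            weights[(a, b)] += 1
    return dict(weights)


def build_document_graph(papers):
    """Directed document graph: paper id -> set of known papers it links to."""
    return {
        pid: {cited for cited in record["cites"] if cited in papers}
        for pid, record in papers.items()
    }


user_graph = build_user_graph(PAPERS)
document_graph = build_document_graph(PAPERS)
```

Edge weights in `user_graph` count joint papers, which is one simple way to express coauthorship strength. How ACT-LDA itself consumes these graphs is not specified in the abstract.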