Journal of Frontiers of Computer Science and Technology, 2016, Vol. 10, Issue (2): 194-200. DOI: 10.3778/j.issn.1673-9418.1506046

Community Discovery Algorithm Based on Combination of Users Generated Contents and Link Relationships

ZHANG Ende+, GAO Kening, ZHANG Yu, LI Feng   

  1. Computing Center, Northeastern University, Shenyang 110819, China
  • Online: 2016-02-01 Published: 2016-02-03

结合用户生成内容与链接关系的社区发现算法

张恩德+,高克宁,张  昱,李  封   

  1. 东北大学 计算中心,沈阳 110819

Abstract: Community discovery has long been an active topic in social network research. However, current community discovery algorithms focus mainly on the link relationships between users and pay little attention to the large volume of user generated contents (UGC). User generated content is a defining feature of Web 2.0 and an important reason that social network platforms attract users, and it plays an important role in the formation of communities. This paper presents a new community discovery algorithm that jointly uses the link relationships between users and user generated contents to determine each user's community. The algorithm uses latent Dirichlet allocation (LDA) to analyze text, the main form of user generated content, uses spectral analysis to analyze the link relationships between users, and combines the two to discover the community structure of the network. Experiments on real-world data from the ScienceNet (科学网) website show that the proposed algorithm effectively combines user generated contents with user link relationships, making the community discovery results more objective and accurate.

Key words: community discovery, user generated contents, user link relationships, social networks
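
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the two signals are combined, so the weighted sum and its `alpha` parameter are assumptions, as are all function names. The per-user topic distributions `theta` are assumed to come from an LDA model fitted elsewhere, and the spectral step is reduced to a two-way split on the sign of the Fiedler vector of the combined graph.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def combined_weights(adj, theta, alpha=0.5):
    """Blend link structure and content similarity into one weight matrix.

    alpha is a hypothetical mixing parameter, not taken from the paper.
    """
    n = len(adj)
    return [
        [0.0 if i == j
         else alpha * adj[i][j] + (1 - alpha) * cosine(theta[i], theta[j])
         for j in range(n)]
        for i in range(n)
    ]


def fiedler_vector(w, iters=500):
    """Approximate the Fiedler vector of the Laplacian L = D - W.

    Power iteration on (c*I - L), restricted to vectors orthogonal to the
    constant vector, converges to the eigenvector of the smallest nonzero
    eigenvalue of L. Assumes a connected graph with at least one edge.
    """
    n = len(w)
    deg = [sum(row) for row in w]
    c = 2 * max(deg)  # Gershgorin bound on the largest eigenvalue of L
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant component
        nv = [(c - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
              for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nv))
        if norm == 0:
            break
        v = [x / norm for x in nv]
    return v


def discover_two_communities(adj, theta, alpha=0.5):
    """Split users into two communities by the sign of the Fiedler vector."""
    v = fiedler_vector(combined_weights(adj, theta, alpha))
    return [0 if x < 0 else 1 for x in v]


# Toy data: two triangles of users joined by a single link (2-3).
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
# Hypothetical LDA output: users 0-2 favor topic 0, users 3-5 favor topic 1.
theta = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
]
labels = discover_two_communities(adj, theta)
print(labels)
```

On this toy input the two triangles end up in different communities, since both the links and the topic profiles agree on the split. A realistic version would fit LDA on the users' texts to obtain `theta` and cluster several eigenvectors to recover more than two communities.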

摘要: 社区发现一直是社会网络研究中的热点内容。但是当前社区发现算法更加关注用户与用户之间的链接关系,而对社会网络中用户生成内容(user generated contents, UGC)大数据研究较少。用户生成内容是Web2.0的特点,也是社会网络平台吸引用户的重要原因之一,对社区的形成起着重要作用。提出了一种新的社区发现算法,能够综合利用用户与用户之间的链接关系以及用户生成内容来确定用户的社区划分。该算法用LDA(latent Dirichlet allocation)算法分析用户生成内容中主要的内容形式——文本信息,同时通过谱分析方法分析用户与用户之间的链接关系,并有机结合以发现网络的社区结构。通过分析科学网的真实数据,证明了所提算法能够有效综合利用用户生成内容与用户链接关系,使社区发现的结果更加客观准确。

关键词: 社区发现, 用户生成内容, 用户链接关系, 社会网络