Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2015, Vol. 9, Issue 9: 1100-1107. DOI: 10.3778/j.issn.1673-9418.1410013

• Network and Information Security •

Method for Rapid Detecting Micro-Blog Communities

LIU Chao1, XU Yabin1,2+, WU Zhuang1

  1. School of Computer, Beijing Information Science & Technology University, Beijing 100101, China
  2. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing 100101, China
  • Online: 2015-09-01 Published: 2015-12-11

Abstract: Micro-blog community detection has important applications in areas such as public opinion analysis and personalized recommendation. To detect communities in micro-blog social networks accurately and efficiently, this paper proposes a micro-blog community detection method based on hierarchical clustering of edges. The method further improves detection accuracy by merging highly overlapping communities and correcting partition errors. To improve efficiency, the computation is distributed and parallelized with the MapReduce programming model provided by Hadoop, an open-source cloud computing platform. Experimental results show that the method achieves both relatively high accuracy and relatively high efficiency.

Keywords: micro-blog, micro-blog community detection, hierarchical edge clustering, MapReduce
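
The abstract names the merging of highly overlapping communities as one of the accuracy refinements but does not give the criterion used. The following is a minimal sketch of what such a step could look like, not the authors' procedure. The overlap measure (shared members divided by the size of the smaller community), the 0.7 threshold, and the names `overlap_ratio` and `merge_overlapping` are all assumptions made for illustration.

```python
from typing import List, Set


def overlap_ratio(a: Set[str], b: Set[str]) -> float:
    """Shared members divided by the size of the smaller community.

    This measure is an assumption; the abstract does not define one.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def merge_overlapping(communities: List[Set[str]],
                      threshold: float = 0.7) -> List[Set[str]]:
    """Repeatedly merge any pair whose overlap ratio reaches the threshold.

    The threshold value is hypothetical and would need tuning on real data.
    """
    merged = [set(c) for c in communities]  # copy so the input is untouched
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlap_ratio(merged[i], merged[j]) >= threshold:
                    merged[i] |= merged[j]  # absorb community j into i
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the list has changed
    return merged


if __name__ == "__main__":
    found = [{"a", "b", "c", "d"}, {"b", "c", "d", "e"}, {"x", "y"}]
    print(merge_overlapping(found))
```

In this example the first two communities share three of four members (ratio 0.75), so they are merged, while the third community is left alone. The pairwise scan is quadratic in the number of communities; at the scale the abstract targets, a step like this would presumably be distributed as well, which the sketch does not attempt.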