Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2017, Vol. 11 ›› Issue (1): 1-23. DOI: 10.3778/j.issn.1673-9418.1601037

• Survey · Exploration •

Survey on Social Media Big Data Analytics

DU Zhijuan+ (杜治娟), WANG Shuo (王硕), WANG Qiuyue (王秋月), MENG Xiaofeng (孟小峰)

  1. School of Information, Renmin University of China, Beijing 100872, China
  • Online: 2017-01-01  Published: 2017-01-10

Abstract: Social media is an important channel through which people spread information and express opinions, and it contains a large amount of useful information. In recent years it has become one of the most representative sources of big data, and mining and analyzing this information has a profound impact on social development. Following the constituent elements of social media, this survey divides current research into three categories: user-based analysis, relationship-based analysis, and analysis based on interactive content. First, user-based analysis is reviewed in terms of identifying users across multi-source heterogeneous networks, detecting communities, and computing user influence. Second, relationship-centered analysis is discussed from three angles: computing the strength of user relationships, information diffusion, and influence maximization. Third, four problems based on users' interactive content are examined: feature extraction and selection, topic and event mining, multimedia data analysis, and sentiment analysis. Finally, the paper identifies the challenges facing existing research and new directions for future work in six areas: information diffusion, influence computation, feature extraction and selection, microblog news mining, social media big data fusion, and cross-lingual sentiment analysis.

Key words: social media, big data, user behavior, interactive relationship, interactive content