Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2014, Vol. 8 ›› Issue (5): 608-613. DOI: 10.3778/j.issn.1673-9418.1306025

• Artificial Intelligence and Pattern Recognition •

An Algorithm for Computing Sentiment Description Value of Text

QI Baoyuan1,2+, SHI Zhongzhi1   

  1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100039, China
  • Online: 2014-05-01 Published: 2014-05-05

Abstract: With the rapid growth of information on the Internet, more and more people have become producers of information, which places higher demands on information processing. Computing the sentiment description value of a text is important for measuring its polarity. This paper first preprocesses the text to select the sentences that can determine its polarity, then computes the sentiment description value of each clause, and finally combines the clause-level values to obtain the sentiment description value of the whole text. It also analyzes the influence of factors such as text length and syntactic structure. Experimental results show that the proposed algorithm achieves relatively high accuracy and speed, and is well suited to computing sentiment values when processing large-scale stream data.

Key words: sentiment description value, sentiment classification, stream data
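
The three stages named in the abstract (select polarity-bearing clauses, score each clause, combine the scores) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy English lexicon `LEXICON`, the negator list `NEGATORS`, the punctuation-based clause splitting, the sign flip for negation, and the use of a plain average to combine clause values are all hypothetical choices.

```python
import re

# Hypothetical toy lexicon and negator list; the abstract does not
# specify the sentiment resources the paper actually uses.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}


def tokenize(clause):
    """Lowercase a clause and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", clause.lower())


def split_clauses(text):
    """Split text into clauses on sentence and clause punctuation."""
    parts = re.split(r"[.!?;,]+", text)
    return [p.strip() for p in parts if p.strip()]


def has_sentiment(clause):
    """Stage 1 (assumed rule): keep clauses containing a lexicon word."""
    return any(tok in LEXICON for tok in tokenize(clause))


def clause_value(clause):
    """Stage 2 (assumed rule): sum lexicon scores in one clause,
    flipping the sign of the next sentiment word after a negator."""
    total = 0.0
    negate = False
    for tok in tokenize(clause):
        if tok in NEGATORS:
            negate = True
            continue
        score = LEXICON.get(tok)
        if score is not None:
            total += -score if negate else score
            negate = False
    return total


def text_value(text):
    """Stage 3 (assumed rule): average the selected clause values,
    returning 0.0 when no clause carries sentiment."""
    selected = [c for c in split_clauses(text) if has_sentiment(c)]
    if not selected:
        return 0.0
    return sum(clause_value(c) for c in selected) / len(selected)
```

For example, `text_value("The screen is great, but the battery is not good.")` scores the first clause at 2.0 and the second at -1.0, giving an average of 0.5. Averaging is only one way to keep long texts from dominating; the paper's own treatment of text length and syntactic structure is not described in the abstract.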