Method on Building Chinese Text Sentiment Lexicon

doi:10.3778/j.issn.1673-9418.1305008

Journal of Frontiers of Computer Science and Technology, 2013, Vol. 7, Issue 11: 1033-1039. DOI: 10.3778/j.issn.1673-9418.1305008



YANG Aimin1+, LIN Jianghao2, ZHOU Yongmei1

1. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510420, China
2. School of Management, Guangdong University of Foreign Studies, Guangzhou 510420, China

Online: 2013-11-01; Published: 2013-11-04

中文文本情感词典构建方法

阳爱民1+，林江豪2，周咏梅1

1. 广东外语外贸大学思科信息学院，广州 510420
2. 广东外语外贸大学国际工商管理学院，广州 510420

Abstract

Abstract: Sentiment analysis of massive Internet text is currently a hot research topic. This paper describes a method for constructing a Chinese text sentiment lexicon. The method selects several sentiment seed words and uses the result counts returned by a search engine to compute the sentiment weights of the words in a general sentiment lexicon with an improved pointwise mutual information (PMI) algorithm. To examine the validity of the method, the constructed lexicon is applied to text sentiment classification, and the classification performance of the lexicon-based method is compared with that of a naïve Bayes classifier. The experimental results indicate that the constructed sentiment lexicon can be used effectively both for sentiment feature selection and directly for sentiment classification, and that its classification performance is stable.

Key words: sentiment lexicon, sentiment classification, pointwise mutual information (PMI), naïve Bayes
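
The abstract does not spell out the improved PMI formula, so the following is only a minimal sketch of the standard seed-word PMI weighting that such methods build on, not the authors' algorithm. It assumes a hypothetical `hits` function that returns search-engine result counts for one term or for two co-occurring terms; here it is backed by a small hand-made table (`TOY_HITS`) with invented counts, and the seed words, the `total` corpus size, and the smoothing constant `eps` are illustrative assumptions as well.

```python
import math
from typing import Callable, Dict, FrozenSet, Iterable, List

# Invented result counts standing in for real search-engine queries.
TOY_HITS: Dict[FrozenSet[str], int] = {
    frozenset({"good"}): 1000,
    frozenset({"bad"}): 1000,
    frozenset({"tasty"}): 100,
    frozenset({"awful"}): 80,
    frozenset({"tasty", "good"}): 50,
    frozenset({"tasty", "bad"}): 5,
    frozenset({"awful", "good"}): 4,
    frozenset({"awful", "bad"}): 40,
}


def hits(*terms: str) -> int:
    """Hypothetical lookup: result count for one term or a co-occurring pair."""
    return TOY_HITS.get(frozenset(terms), 0)


def pmi(co_hits: int, hits_a: int, hits_b: int, total: int, eps: float = 0.01) -> float:
    """PMI estimated from counts; eps avoids log(0) and division by zero."""
    return math.log2(((co_hits + eps) * total) / ((hits_a + eps) * (hits_b + eps)))


def sentiment_weight(
    word: str,
    pos_seeds: Iterable[str],
    neg_seeds: Iterable[str],
    hits_fn: Callable[..., int],
    total: int,
) -> float:
    """Sum of PMI with positive seeds minus sum of PMI with negative seeds."""
    pos = sum(pmi(hits_fn(word, s), hits_fn(word), hits_fn(s), total) for s in pos_seeds)
    neg = sum(pmi(hits_fn(word, s), hits_fn(word), hits_fn(s), total) for s in neg_seeds)
    return pos - neg


def classify(tokens: List[str], lexicon: Dict[str, float]) -> str:
    """Lexicon-based classification: sign of the summed word weights."""
    score = sum(lexicon.get(tok, 0.0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Build a tiny lexicon from the toy counts, then classify a token list.
lexicon = {
    w: sentiment_weight(w, ["good"], ["bad"], hits, total=10**6)
    for w in ("tasty", "awful")
}
label = classify(["tasty", "meal"], lexicon)
```

A word that co-occurs more often with the positive seeds than with the negative ones receives a positive weight, and a text is labelled by the sign of its summed weights. In practice the counts would come from real search queries over segmented Chinese text, and the paper's improved weighting would replace `sentiment_weight`.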

摘要： 互联网海量文本的情感分析是当前的一个研究热点。介绍了一种中文文本情感词典构建方法，该方法选用若干个情感种子词，利用搜索引擎返回的共现数，通过改进的PMI（pointwise mutual information）算法计算情感词的情感权值。将构建的情感词典应用到文本情感分类实验中，在不同的语料环境下，对比基于情感词典和朴素贝叶斯分类器下的文本情感分类效果，实验结果表明，构建的情感词典，可有效用于情感特征选择和直接用于情感分类，并且分类性能稳定。

关键词: 情感词典, 情感分类, PMI算法, 朴素贝叶斯

YANG Aimin, LIN Jianghao, ZHOU Yongmei. Method on Building Chinese Text Sentiment Lexicon[J]. Journal of Frontiers of Computer Science and Technology, 2013, 7(11): 1033-1039.

阳爱民，林江豪，周咏梅. 中文文本情感词典构建方法[J]. 计算机科学与探索, 2013, 7(11): 1033-1039.

