Sentiment Polarity Discrimination Method Based on Topic Clustering

doi:10.3778/j.issn.1673-9418.1507044

Abstract

Most state-of-the-art sentiment analysis methods rely on extracting sentiment features and feeding them to a classifier. However, because online texts vary widely in expression and are scattered across themes, extracting suitable sentiment features is difficult. This paper proposes a novel algorithm to address this problem. First, the original texts are clustered by topic with the LDA (latent Dirichlet allocation) model. Then, for each topic subset, separate language models are trained on the positive and negative samples using a recurrent neural network. Finally, the topic probability and the sentiment probability are combined to evaluate the sentiment polarity of a text. Dividing the texts into topic subcategories standardizes their expression and limits the changes in word meaning across topics, while the language models avoid the difficulty of explicit feature extraction by internalizing it in model training. Experimental results on IMDB show that topic clustering substantially improves the classification accuracy of the proposed method.
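The final combination step can be illustrated with a minimal sketch. The abstract does not give the exact combination rule, so the mixture used here (weighting each topic-specific language model likelihood by the topic probability, with equal priors on the two polarities) is an assumption, and the names `combine_polarity`, `topic_probs`, and `lm_loglik` are hypothetical. The topic probabilities would come from the LDA model and the log-likelihoods from the per-topic positive and negative recurrent language models.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def combine_polarity(topic_probs, lm_loglik):
    """Combine topic and sentiment probabilities for one text.

    topic_probs[k]  : P(topic k | text), e.g. from an LDA model.
    lm_loglik[k]    : (log P(text | positive LM of topic k),
                       log P(text | negative LM of topic k)).

    Assumed rule (not taken from the paper):
        score(s) = sum_k P(k | text) * P(text | LM_{k,s})
    normalized over the two polarities with equal priors.
    """
    scores = {}
    for idx, label in enumerate(("positive", "negative")):
        # Work in log space: language model likelihoods of whole texts
        # are far too small to exponentiate directly.
        terms = [
            math.log(p) + lm_loglik[k][idx]
            for k, p in enumerate(topic_probs)
            if p > 0
        ]
        scores[label] = logsumexp(terms)
    norm = logsumexp(list(scores.values()))
    return {label: math.exp(s - norm) for label, s in scores.items()}


# Toy values: two topics, made-up log-likelihoods.
result = combine_polarity([0.7, 0.3], [(-120.0, -123.0), (-130.0, -125.0)])
predicted = max(result, key=result.get)
```

In this toy case the dominant topic's positive language model assigns the text a higher likelihood, so `predicted` is `"positive"`. Working in log space is a design choice that keeps the sketch stable for realistic text lengths.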

Key words: sentiment analysis, topic model, recurrent neural network

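Abstract (Chinese version): At present, most methods for judging the sentiment polarity of text extract sentiment features and apply a classifier. However, because online texts are expressed in diverse ways and spread across scattered topics, sentiment feature extraction becomes increasingly difficult. Using the LDA (latent Dirichlet allocation) topic model, the texts are first clustered by topic. Then, within each topic subclass, a recurrent neural network is used to build separate language models for the positive and negative sentiment samples. Finally, a joint decision is made from the probability of the topic and the probability of the sentiment to which a text belongs. Dividing the texts into subclasses regularizes how texts under different topics are expressed and limits the change of word meaning across topics, while training language models sidesteps the difficulty of extracting features directly by internalizing feature discovery in model training. Experiments on IMDB movie review samples show that applying topic clustering significantly improves classification accuracy.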


LI Tianchen, YIN Jianping. Sentiment Polarity Discrimination Method Based on Topic Clustering[J]. Journal of Frontiers of Computer Science and Technology, 2016, 10(7): 989-994.


