Journal of Frontiers of Computer Science and Technology, 2021, Vol. 15, Issue (8): 1459-1468. DOI: 10.3778/j.issn.1673-9418.2007009

• Artificial Intelligence •

Study on Predicting Psychological Traits of Online Text by BERT

ZHANG Han, JIA Tianyuan, LUO Fang, ZHANG Sheng, WU Xia   

    1. School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
    2. Faculty of Psychology, Beijing Normal University, Beijing 100875, China
    3. National Innovation Center for Monitoring Basic Education Quality, Beijing Normal University, Beijing 100875, China
    4. Engineering Research Center of Intelligent Technology and Educational Application, Ministry of Education, Beijing 100875, China
  • Online: 2021-08-01 Published: 2021-08-02

面向网络文本的BERT心理特质预测研究

张晗, 贾甜远, 骆方, 张生, 邬霞

    1. 北京师范大学 人工智能学院,北京 100875
    2. 北京师范大学 心理学部,北京 100875
    3. 北京师范大学 中国基础教育质量监测协同创新中心,北京 100875
    4. 智能技术与教育应用教育部工程研究中心,北京 100875

Abstract:

With the rapid development and spread of the Internet, a growing number of people use online platforms to express themselves and communicate with others. In doing so, they inevitably leave behind large amounts of online text tied to personal information. This unstructured text often captures authentic expression in different situations and reflects a person's inner psychological traits and personality tendencies. Applying text mining techniques to infer psychological traits from online text helps individuals understand themselves, and it also avoids the interference from test-taking motivation that affects traditional psychological assessment methods. In recent years, the language model bidirectional encoder representations from transformers (BERT) has greatly improved performance on both text classification and sentiment analysis tasks. In this paper, prediction models for psychological traits are constructed from online text. BERT is used to obtain comprehensive contextual semantic features and long-range dependencies in the context. Because classifiers with different internal structures can yield different classification results, the fully connected layer of the BERT-Base model and the random forest algorithm are each used as the classifier in the downstream task, and their performance is compared. The results show that BERT-based text classification can effectively predict psychological traits, with average accuracy, average precision, and other indicators all above 97%.

Key words: bidirectional encoder representations from transformers (BERT), psychological trait, attention mechanism, Transformer, text mining
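
The abstract reports average accuracy and average precision for two downstream classifiers but does not state how the averages are taken. As a minimal sketch, assuming that precision is macro-averaged over trait classes (an assumption, not confirmed by the source), the following pure-Python code computes both metrics for each classifier. The labels, the predictions, and the names `fc_layer` and `random_forest` are hypothetical toy values for illustration only; they are not data or results from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_precision(y_true, y_pred):
    """Return {label: precision} for every label seen in y_true or y_pred.

    A label that is never predicted gets precision 0.0 (a convention chosen
    here for the sketch).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    predicted = Counter(y_pred)  # how often each label was predicted
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    labels = sorted(set(y_true) | set(y_pred))
    return {
        label: true_pos[label] / predicted[label] if predicted[label] else 0.0
        for label in labels
    }


def macro_precision(y_true, y_pred):
    """Unweighted mean of the per-class precision values."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: a binary "high"/"low" trait label for six texts.
y_true = ["high", "high", "low", "low", "high", "low"]
predictions = {
    "fc_layer": ["high", "high", "low", "low", "high", "high"],
    "random_forest": ["high", "low", "low", "high", "high", "low"],
}

for name, y_pred in predictions.items():
    print(
        f"{name}: accuracy={accuracy(y_true, y_pred):.3f}, "
        f"macro precision={macro_precision(y_true, y_pred):.3f}"
    )
```

If the averages in the paper were instead taken over several traits or over cross-validation folds, the same functions could be applied per trait or per fold and the resulting scores averaged.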

摘要:

随着互联网的普及应用,通过网络平台进行表达和交流的用户越来越多,在此过程中不可避免地会留下与个人相关的大量网络文本数据和信息,这些非结构化的文本数据往往体现着不同场景下的真实表达,反映了人们内在的心理特质及人格倾向。利用文本挖掘相关技术基于网络文本数据分析心理特质可以弥补传统心理测量方法易受应试动机等因素影响的缺陷。近年来,BERT语言表示模型在文本分类、情感分析等任务上取得了很好的效果。针对网络文本数据构建心理特质预测模型,基于BERT获取完整的上下文语义特征和长距离的上下文依赖关系;同时考虑到分类器内部结构的差异可能会导致不同的分类效果,在下游分类任务中分别采用BERT-Base模型的全连接层和经典的随机森林算法作为两种不同的分类器进行模型效果对比。结果显示,基于BERT的文本分类模型能够有效实现心理特质的预测,平均准确率、平均精准率等各项指标都在97%以上。

关键词: BERT, 心理特质, 注意力机制, Transformer, 文本挖掘