Study on Application of Transfer Learning in Entity Recognition of Low Resource Environment

doi:10.3778/j.issn.1673-9418.2107097

Abstract

Entity recognition is a fundamental task in information extraction. Recognizing entities effectively in low-resource scenarios, where sufficient annotated corpora are lacking, remains a challenging problem in natural language processing. Building on a pre-trained model, this paper adopts a “unified encoding, separate decoding” scheme that learns abstract entity boundary information from large-scale domains and, through transfer learning, transfers that information to low-resource scenarios, improving the accuracy of entity recognition there. Unlike existing methods, the feature vectors are adapted only immediately before decoding. An adaptive module separately decodes each feature vector produced by the unified encoder, according to the entity types and annotation-scheme dimension of the target domain, and determines how each entity is annotated, which avoids complex nested-entity problems. Experiments on public datasets show that, compared with the BERT-BiLSTM-CRF baseline, precision increases by 4 percentage points, recall by 5.4 percentage points, and F1 by 4.72 percentage points in the low-resource pharmaceutical-domain scenario; in the low-resource personnel-domain scenario, precision increases by 31.91 percentage points, recall by 31.7 percentage points, and F1 by 31.86 percentage points. Experiments on a self-collected and curated dataset likewise demonstrate the model's effectiveness for entity recognition in low-resource scenarios, with improvements in precision, recall, and other metrics over the Lattice-BERT model.

Key words: transfer learning, entity recognition, low-resource scenario, sequence labeling
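
The scheme described in the abstract, in which one shared encoder feeds separate per-domain decoders, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `SharedEncoder` is a deterministic stand-in for the pre-trained model, the `DomainDecoder` weights are untrained and use a plain argmax rather than whatever decoder the authors use, and the entity types, tagging schemes, and English tokens are hypothetical. The only point it tries to show is that the sentence is encoded once, while each domain's decoder sizes its output by that domain's entity types and annotation scheme.

```python
import random
from typing import Dict, List, Sequence

Vector = List[float]

# Tag prefixes per annotation scheme (hypothetical choice of schemes).
SCHEME_PREFIXES = {"BIO": ("B", "I"), "BIOES": ("B", "I", "E", "S")}


def build_label_set(entity_types: Sequence[str], scheme: str) -> List[str]:
    """Label inventory for one domain: 'O' plus one tag per (prefix, type)."""
    labels = ["O"]
    for entity_type in entity_types:
        for prefix in SCHEME_PREFIXES[scheme]:
            labels.append(f"{prefix}-{entity_type}")
    return labels


class SharedEncoder:
    """Stand-in for the unified encoder: one feature vector per token."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim

    def encode(self, tokens: Sequence[str]) -> List[Vector]:
        features = []
        for token in tokens:
            rng = random.Random(token)  # string seed: same token, same vector
            features.append([rng.uniform(-1.0, 1.0) for _ in range(self.dim)])
        return features


class DomainDecoder:
    """Per-domain head whose output size follows the domain's label set."""

    def __init__(self, input_dim: int, entity_types: Sequence[str],
                 scheme: str, seed: int = 0) -> None:
        self.labels = build_label_set(entity_types, scheme)
        rng = random.Random(seed)
        # Untrained random weights: one row of size input_dim per label.
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(input_dim)]
                        for _ in self.labels]

    def decode(self, features: Sequence[Vector]) -> List[str]:
        tags = []
        for vec in features:
            scores = [sum(w * x for w, x in zip(row, vec))
                      for row in self.weights]
            tags.append(self.labels[scores.index(max(scores))])
        return tags


encoder = SharedEncoder(dim=8)
decoders: Dict[str, DomainDecoder] = {
    "medicine": DomainDecoder(encoder.dim, ["DRUG", "DISEASE"], "BIO"),
    "personnel": DomainDecoder(encoder.dim, ["PER", "ORG", "TITLE"],
                               "BIOES", seed=1),
}

tokens = ["aspirin", "treats", "headache"]
features = encoder.encode(tokens)  # encoded once, reused by every decoder
for domain, decoder in decoders.items():
    print(domain, len(decoder.labels), decoder.decode(features))
```

With these hypothetical domains, the BIO decoder has 2 × 2 + 1 = 5 output labels and the BIOES decoder has 4 × 3 + 1 = 13. The predicted tags carry no meaning because nothing is trained; in a transfer setting one would presumably train the encoder on the large-scale domain and fit only a small decoder on the low-resource one.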

DU Peng, ZHANG Youming, ZHU Zhengzhou, LI Guocai. Study on Application of Transfer Learning in Entity Recognition of Low Resource Environment[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(4): 912-921.

杜鹏, 张有明, 朱郑州, 李国才. 迁移学习在低资源场景实体识别中的应用研究[J]. 计算机科学与探索, 2023, 17(4): 912-921.
