Journal of Frontiers of Computer Science and Technology ›› 2020, Vol. 14 ›› Issue (8): 1261-1274. DOI: 10.3778/j.issn.1673-9418.2002020
ZHAO Pengfei, LI Yanling, LIN Min
Online: 2020-08-01
Published: 2020-08-07
Abstract:
Spoken language understanding (SLU) is a key component of human-machine dialogue systems, and intent detection, as a subtask of SLU, plays a crucial role because it allows a domain-restricted dialogue system to be extended to new domains. Demand for dialogue systems in real application domains keeps growing, yet a newly developed domain cannot accumulate large amounts of labeled data in a short time, which poses a challenge for building deep learning models in such domains. Transfer learning is a special application of deep learning: it exploits both a source domain and a target domain to build a model for a target domain that has only a small amount of labeled data, completing the learning process by transferring knowledge between the two domains. Building a dialogue system for a new domain with few labeled examples, by reusing the labeled data and models of existing domains, is a current research focus. This paper surveys the intent detection task, classifies and describes transfer learning methods, summarizes their problems and possible solutions, and further considers how transfer learning can be applied to intent detection, so as to advance research on intent detection in new domains with scarce data.
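The knowledge-transfer scheme sketched above, pretraining a classifier on a data-rich source domain and then fine-tuning it on a handful of labeled target-domain examples, can be illustrated with a minimal toy sketch. All data, the two-dimensional features, and the plain logistic-regression "intent classifier" below are hypothetical stand-ins, not the models of the surveyed systems:

```python
# Toy parameter-transfer sketch: pretrain a binary intent classifier on a
# large synthetic source domain, then fine-tune the same weights on only
# ten labeled target-domain examples with a slightly shifted distribution.
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, w=None, epochs=200, lr=0.5):
    """Batch gradient descent for binary logistic regression (no bias term)."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))    # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)    # cross-entropy gradient step
    return w

# Source domain: plenty of labeled utterance feature vectors (toy 2-D data).
Xs = rng.normal([[1, 1]] * 200 + [[-1, -1]] * 200, 0.5)
ys = np.array([1] * 200 + [0] * 200)

# Target domain: only 10 labels, with a slightly shifted distribution.
Xt = rng.normal([[1.5, 0.5]] * 5 + [[-0.5, -1.5]] * 5, 0.5)
yt = np.array([1] * 5 + [0] * 5)

w_src = train_logreg(Xs, ys)                             # pretrain on source
w_ft = train_logreg(Xt, yt, w=w_src.copy(), epochs=20)   # brief target fine-tune

pred = (Xt @ w_ft > 0).astype(int)
print("target accuracy:", (pred == yt).mean())
```

Initializing the target model from the source weights is what distinguishes this from training from scratch: with only ten target labels, the fine-tuning step mostly adjusts a decision boundary that the source domain has already placed approximately right.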
ZHAO Pengfei, LI Yanling, LIN Min. Research Progress on Intent Detection Oriented to Transfer Learning[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(8): 1261-1274.