Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (8): 2156-2168. DOI: 10.3778/j.issn.1673-9418.2306073
• Artificial Intelligence·Pattern Recognition •
Research on University Basic Knowledge Question-Answering Using Low-Rank Encoding to Optimize Large Language Model
LUO Shijie, JIN Rize, HAN Shuzhen
Online: 2024-08-01
Published: 2024-07-29
骆仕杰,金日泽,韩抒真
LUO Shijie, JIN Rize, HAN Shuzhen. Research on University Basic Knowledge Question-Answering Using Low-Rank Encoding to Optimize Large Language Model[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(8): 2156-2168.
骆仕杰, 金日泽, 韩抒真. 采用低秩编码优化大语言模型的高校基础知识问答研究[J]. 计算机科学与探索, 2024, 18(8): 2156-2168.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2306073
[1] XIANG Xiaowei, SHEN Yanguang, HU Minghao, YAN Tianwei, LUO Wei, LUO Zhunchen. Research on Science and Technology Policy and Regulation Q&A System Driven by Large Models[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2349-2360.
[2] LI Yifei, ZHANG Lingling, DONG Yuxuan, WANG Jiaxin, ZHONG Yujie, WEI Bifan. Large Language Model Augmentation and Feature Alignment Method for Few-Shot Continual Relation Extraction[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2326-2336.
[3] JI Guiyang, WANG Peiyan, YU Zhuo. Research on Knowledge Injection Method for Large Language Model Oriented to Process Specification Texts[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2361-2369.
[4] CHEN Longfei, GAO Xin, HOU Haotian, YE Chuyang, LIU Ya'ou, ZHANG Meihui. Application of Generative Large Language Models in Chinese Radiology Domain[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2337-2348.
[5] SHENG Lei, CHEN Xiliang, LAI Jun. Offline Multi-agent Reinforcement Learning Method Based on Latent State Distribution GPT[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(8): 2169-2179.
[6] CHEN Zhongyong, HUANG Yongsheng, ZHANG Min, JIANG Ming. Study on Entity Extraction Method for Pharmaceutical Instructions Based on Pretrained Models[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1911-1922.
[7] ZHANG Qi, ZHONG Hao. Submodular Optimization Approach for Entity Summarization in Knowledge Graph Driven by Large Language Models[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1806-1813.
[8] YUAN Heng, GENG Yikun. Feature Refinement and Multi-scale Attention for Transformer Image Denoising Network[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1838-1851.
[9] CHEN Dongyang, MAO Li. Research on Stock Price Prediction Integrating Incremental Learning and Transformer Model[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1889-1899.
[10] FENG Jun, CHANG Yanghong, LU Jiamin, TANG Hailin, LYU Zhipeng, QIU Yuchun. Construction and Application of Knowledge Graph for Water Engineering Scheduling Based on Large Language Model[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(6): 1637-1647.
[11] ZHANG Kaili, WANG Anzhi, XIONG Yawei, LIU Yun. Survey of Transformer-Based Single Image Dehazing Methods[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(5): 1182-1196.
[12] CHEN Qian, HONG Zheng, SI Jianpeng. Application Layer Protocol Recognition Incorporating SENet and Transformer[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(3): 805-817.
[13] XUE Jinqiang, WU Qin. Lightweight Cross-Gating Transformer for Image Restoration and Enhancement[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(3): 718-730.
[14] PENG Bin, BAI Jing, LI Wenjing, ZHENG Hu, MA Xiangyu. Survey on Visual Transformer for Image Classification[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(2): 320-344.
[15] WANG Qiang, LU Xianling. Transformer Object Tracking Algorithm Based on Spatio-Temporal Template Update[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(9): 2161-2173.