Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (8): 2156-2168. DOI: 10.3778/j.issn.1673-9418.2306073

• Artificial Intelligence · Pattern Recognition •

Research on University Basic Knowledge Question-Answering Using Low-Rank Encoding to Optimize Large Language Model

LUO Shijie, JIN Rize, HAN Shuzhen

  1. Office of Cyberspace Affairs, Tiangong University, Tianjin 300387, China
  2. School of Software, Tiangong University, Tianjin 300387, China
  • Online: 2024-08-01    Published: 2024-07-29

Abstract: In the field of higher education, basic knowledge question-answering (QA) systems play a crucial role in enhancing students' academic performance and in the equitable distribution of educational resources. In recent years, QA techniques based on machine reading comprehension and text similarity matching have been developed on top of pre-trained language models, yet when handling complex natural language questions they still suffer from limited answer quality and accuracy, caused by bottlenecks such as insufficient training data and restricted model generalization. This research aims to improve the performance and accuracy of basic knowledge QA systems in the university setting while reducing resource consumption. To this end, a fine-tuning method for large language models based on low-rank encoding is proposed for the domain of university basic knowledge. The method uses low-rank encoding to reduce the memory and GPU-memory consumption of the large language model during training and inference, and exploits the generative capability of the model for the study and analysis of the university's basic knowledge QA data, thereby improving the quality, accuracy, and response speed of everyday basic knowledge QA. The weights of the large pre-trained model are frozen, university basic knowledge is integrated into the pre-trained layers of the original Transformer architecture, and a QA optimization module is added to constrain the accuracy of the generative model. While significantly reducing the number of trainable parameters for downstream tasks, the method largely preserves the generative language ability of the original model and achieves superior performance and accuracy in the university basic knowledge domain.
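To make the low-rank fine-tuning idea concrete, the fragment below is a minimal PyTorch sketch of a LoRA-style adapter: a pre-trained linear projection is frozen and a small trainable low-rank correction is added on top, which is the mechanism that reduces trainable parameters and training-time GPU memory. This is an illustrative reading of the abstract, not the authors' released code; the module name LoRALinear, the targeted projection names q_proj and v_proj, and the hyperparameters r and alpha are assumptions introduced here for demonstration.

    import math
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # Illustrative sketch: wraps a frozen nn.Linear and adds a trainable
        # low-rank update, y = W x + (alpha / r) * B A x.
        def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False          # freeze the pre-trained weights
            self.r, self.alpha = r, alpha
            # Low-rank factors: A maps the input down to rank r, B maps it back up.
            self.lora_A = nn.Parameter(torch.empty(r, base.in_features))
            self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
            nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Frozen path plus scaled low-rank correction; only lora_A and lora_B get gradients.
            return self.base(x) + (self.alpha / self.r) * ((x @ self.lora_A.T) @ self.lora_B.T)

    def add_lora(model: nn.Module, target_names=("q_proj", "v_proj"), r: int = 8) -> nn.Module:
        # Freeze the whole backbone, then swap the chosen attention projections
        # (names assumed here) for LoRA-wrapped versions.
        for p in model.parameters():
            p.requires_grad = False
        for module in model.modules():
            for name in target_names:
                child = getattr(module, name, None)
                if isinstance(child, nn.Linear):
                    setattr(module, name, LoRALinear(child, r=r))
        return model

Under this sketch, each adapted projection trains only r × (d_in + d_out) parameters instead of d_in × d_out, while the frozen base weights keep the original model's generative ability intact.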

Key words: generative language model, fundamental knowledge question-answering, large language model, Transformer, freezing model weights