Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (12): 3260-3271. DOI: 10.3778/j.issn.1673-9418.2406047

• Artificial Intelligence · Pattern Recognition •

Mixture of Expert Large Language Model for Legal Case Element Recognition

YIN Hua, WU Zihao, LIU Tingting, ZHANG Jiajia, GAO Ziqian   

1. School of Digital Economics, Guangdong University of Finance & Economics, Guangzhou 510320, China
  2. School of Informatics, Guangdong University of Finance & Economics, Guangzhou 510320, China
• Online: 2024-12-01  Published: 2024-11-29

Abstract: Intelligent judicial decision-making is gradually aligning with the logic of legal adjudication, and case element recognition is a fundamental task proposed in recent years to support it. Compared with earlier methods based on deep learning and machine reading comprehension, generative element recognition with large language models (LLM) holds greater potential for complex reasoning. However, the current performance of judicial LLMs on this fundamental task remains suboptimal. This paper introduces a conversational mixture-of-experts LLM for element recognition. The model first designs prompts tailored to the characteristics of the cases for the ChatGLM3-6B-base model. The LLM is then fine-tuned with full parameters to acquire basic element recognition capability, and its weights are shared among the subsequent mixture of experts to reduce training cost. To handle different case types and label-imbalanced scenarios, case-specific DoRA experts and label-specific DoRA experts are introduced into the LLM's attention layers, improving the model's ability to distinguish between tasks, and a learnable gating mechanism is designed to select label experts. The proposed model is evaluated on the CAIL2019 dataset and a desensitized theft-case element recognition dataset from a certain province, compared against nine baseline models spanning three categories of methods, and analyzed through ablation experiments. Experimental results show that the overall F1 score of the proposed model exceeds that of the best-performing baseline by 5.9 percentage points. On the label-imbalanced CAIL2019 dataset, the label experts effectively mitigate the impact of extreme data imbalance. Moreover, without repeated full-parameter fine-tuning, the base model trained on CAIL2019 achieves the best results on the provincial theft cases after lightweight fine-tuning of the case and label experts, demonstrating the model's scalability.

Key words: legal case element recognition, large language model, mixture of parameter-efficient experts, prompt
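The abstract describes attaching case-specific and label-specific DoRA experts to the attention projections of a fully fine-tuned, frozen base model, with a learnable gate selecting among the label experts. A minimal PyTorch sketch of that wiring follows; the class names, the residual composition, and the mean-pooled gating input are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRAExpert(nn.Module):
    """Minimal DoRA adapter (sketch): re-parameterizes a frozen weight W0 as
    m * (W0 + B @ A) / ||W0 + B @ A||_col and returns only the adaptation
    residual, so several experts can be summed on top of the shared base."""
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base                                   # shared frozen projection
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))    # B = 0 => zero residual at init
        self.m = nn.Parameter(base.weight.norm(dim=0, keepdim=True))  # column magnitudes

    def forward(self, x):
        w = self.base.weight + self.B @ self.A             # low-rank updated direction
        w = self.m * w / w.norm(dim=0, keepdim=True)       # rescale columns by learned magnitude
        return F.linear(x, w - self.base.weight)           # residual w.r.t. the frozen weight

class MoDoRAProjection(nn.Module):
    """Hypothetical attention projection with one case-type expert and several
    label experts whose residuals are mixed by a learnable gate."""
    def __init__(self, base: nn.Linear, num_label_experts: int, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                        # shared fine-tuned weights stay frozen
        self.case_expert = DoRAExpert(base, rank)
        self.label_experts = nn.ModuleList(
            DoRAExpert(base, rank) for _ in range(num_label_experts))
        self.gate = nn.Linear(base.in_features, num_label_experts)

    def forward(self, x):                                  # x: (batch, seq, in_features)
        gate_w = F.softmax(self.gate(x.mean(dim=1)), dim=-1)              # (batch, experts)
        label_res = torch.stack([e(x) for e in self.label_experts], dim=-1)
        label_res = (label_res * gate_w[:, None, None, :]).sum(dim=-1)    # gated label residual
        return self.base(x) + self.case_expert(x) + label_res
```

In practice such a module would presumably replace selected attention projections (for example, the query and value projections) in each block of ChatGLM3-6B-base, with only the expert and gate parameters updated during the lightweight fine-tuning stage.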
