[1] ZHAO W X, ZHOU K, LI J, et al. A survey of large language models[EB/OL]. [2024-08-18]. https://arxiv.org/abs/2303.18223.
[2] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[EB/OL]. [2024-08-18]. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf.
[3] RADFORD A, WU J, CHILD R, et al. Language models are unsupervised multitask learners[J]. OpenAI Blog, 2019, 1(8): 9.
[4] BROWN T, MANN B, RYDER N, et al. Language models are few-shot learners[C]//Advances in Neural Information Processing Systems 33, 2020: 1877-1901.
[5] SHAHNAZARYAN L, BELOUCIF M. Defining boundaries: the impact of domain specification on cross-language and cross-domain transfer in machine translation[EB/OL]. [2024-08-18]. https://arxiv.org/abs/2408.11926.
[6] MAN Z, HUANG Z, ZHANG Y, et al. WDSRL: multi-domain neural machine translation with word-level domain-sensitive representation learning[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, 32: 577-590.
[7] KOEHN P, KNOWLES R. Six challenges for neural machine translation[EB/OL]. [2024-08-18]. https://arxiv.org/abs/1706.03872.
[8] RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the limits of transfer learning with a unified text-to-text transformer[J]. Journal of Machine Learning Research, 2020, 21(1): 5485-5551.
[9] GURURANGAN S, MARASOVIĆ A, SWAYAMDIPTA S, et al. Don't stop pretraining: adapt language models to domains and tasks[EB/OL]. [2024-08-18]. https://arxiv.org/abs/2004.10964.
[10] 李亚超, 熊德意, 张民. 神经机器翻译综述[J]. 计算机学报, 2018, 41(12): 2734-2755.
LI Y C, XIONG D Y, ZHANG M. A survey of neural machine translation[J]. Chinese Journal of Computers, 2018, 41(12): 2734-2755.
[11] 袁小于. 基于规则的机器翻译技术综述[J]. 重庆文理学院学报(自然科学版), 2011(3): 56-59.
YUAN X Y. Rule-based machine translation technology review[J]. Journal of Chongqing University of Arts and Sciences (Natural Science Edition), 2011(3): 56-59.
[12] 刘占一, 李生, 刘挺, 等. 利用统计搭配模型改进基于实例的机器翻译[J]. 软件学报, 2012, 23(6): 1472-1485.
LIU Z Y, LI S, LIU T, et al. Improving example-based machine translation with statistical collocation model[J]. Journal of Software, 2012, 23(6): 1472-1485.
[13] ZAREMBA W, SUTSKEVER I, VINYALS O. Recurrent neural network regularization[EB/OL]. [2024-08-18]. https://arxiv.org/abs/1409.2329.
[14] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[15] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30, 2017: 5998-6008.
[16] YANG S H, WANG Y X, CHU X W. A survey of deep learning techniques for neural machine translation[EB/OL]. [2024-08-18]. https://arxiv.org/abs/2002.07526.
[17] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2024-08-18]. https://arxiv.org/abs/1810.04805.
[18] LEWIS M, LIU Y H, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[EB/OL]. [2024-09-14]. https://arxiv.org/abs/1910.13461.
[19] UNANUE I J, PARNELL J, PICCARDI M. BERTTune: fine-tuning neural machine translation with BERTScore[EB/OL]. [2024-09-14]. https://arxiv.org/abs/2106.02208.
[20] ZHU J H, XIA Y C, WU L J, et al. Incorporating BERT into neural machine translation[EB/OL]. [2024-09-14]. https://arxiv.org/abs/2002.06823.
[21] XU H, VAN DURME B, MURRAY K. BERT, mBERT, or BiBERT? A study on contextualized embeddings for neural machine translation[EB/OL]. [2024-09-14]. https://arxiv.org/abs/2109.04588.
[22] LIU Y H, GU J T, GOYAL N, et al. Multilingual denoising pre-training for neural machine translation[EB/OL]. [2024-09-14]. https://arxiv.org/abs/2001.08210.
[23] VERMA N, MURRAY K, DUH K. Strategies for adapting multilingual pre-training for domain-specific machine translation[C]//Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas, 2022: 31-44.
[24] YE J J, CHEN X T, XU N, et al. A comprehensive capability analysis of GPT-3 and GPT-3.5 series models[EB/OL]. [2024-09-14]. https://arxiv.org/abs/2303.10420.
[25] ACHIAM J, ADLER S, AGARWAL S, et al. GPT-4 technical report[EB/OL]. [2024-09-14]. https://arxiv.org/abs/2303.08774.
[26] LIALIN V, DESHPANDE V, RUMSHISKY A. Scaling down to scale up: a guide to parameter-efficient fine-tuning[EB/OL]. [2024-09-14]. https://arxiv.org/abs/2303.15647.
[27] XEZONAKI D, KHALIL T, STAP D, et al. Improving domain robustness in neural machine translation with fused topic knowledge embeddings[C]//Proceedings of the 2023 Machine Translation Summit XIX, Vol. 1: Research Track, 2023: 209-221.
[28] MAN Z B, ZHANG Y J, CHEN Y M, et al. Exploring domain-shared and domain-specific knowledge in multi-domain neural machine translation[C]//Proceedings of the 2023 Machine Translation Summit XIX, Vol. 1: Research Track, 2023: 99-110.
[29] SAUNDERS D, DENEEFE S. Domain adapted machine translation: what does catastrophic forgetting forget and why?[C]//Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2024: 12660-12671.
[30] THOMPSON B, GWINNUP J, KHAYRALLAH H, et al. Overcoming catastrophic forgetting during domain adaptation of neural machine translation[C]//Proceedings of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 2062-2068.
[31] ZHANG X, RAJABI N, DUH K, et al. Machine translation with large language models: prompting, few-shot learning, and fine-tuning with QLoRA[C]//Proceedings of the 8th Conference on Machine Translation. Stroudsburg: ACL, 2023: 468-481.
[32] LIU Y H, OTT M, GOYAL N, et al. RoBERTa: a robustly optimized BERT pretraining approach[EB/OL]. [2024-09-25]. https://arxiv.org/abs/1907.11692.
[33] HOULSBY N, GIURGIU A, JASTRZEBSKI S, et al. Parameter-efficient transfer learning for NLP[C]//Proceedings of the 36th International Conference on Machine Learning, 2019: 2790-2799.
[34] BEN ZAKEN E, RAVFOGEL S, GOLDBERG Y. BitFit: simple parameter-efficient fine-tuning for transformer-based masked language-models[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2106.10199.
[35] AGHAJANYAN A, ZETTLEMOYER L, GUPTA S. Intrinsic dimensionality explains the effectiveness of language model fine-tuning[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2012.13255.
[36] HU J E, SHEN Y L, WALLIS P, et al. LoRA: low-rank adaptation of large language models[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2106.09685.
[37] ZHENG J, HONG H, WANG X, et al. Fine-tuning large language models for domain-specific machine translation[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2402.15061.
[38] ZHANG Q R, CHEN M S, BUKHARIN A, et al. AdaLoRA: adaptive budget allocation for parameter-efficient fine-tuning[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2303.10512.
[39] DETTMERS T, PAGNONI A, HOLTZMAN A, et al. QLoRA: efficient finetuning of quantized LLMs[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2305.14314.
[40] 张钦彤, 王昱超, 王鹤羲, 等. 大语言模型微调技术的研究综述[J]. 计算机工程与应用, 2024, 60(17): 17-33.
ZHANG Q T, WANG Y C, WANG H X, et al. Comprehensive review of large language model fine-tuning[J]. Computer Engineering and Applications, 2024, 60(17): 17-33.
[41] LAI W, CHRONOPOULOU A, FRASER A. m4Adapter: multilingual multi-domain adaptation for machine translation with a meta-adapter[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2210.11912.
[42] WU Z L, LUO Y C, WEI D M, et al. HW-TSC's submission to the CCMT 2024 machine translation tasks[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2409.14842.
[43] SCHICK T, SCHÜTZE H. Exploiting cloze-questions for few-shot text classification and natural language inference[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2001.07676.
[44] DONG Q X, LI L, DAI D M, et al. A survey on in-context learning[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2301.00234.
[45] ZHU S, CUI M, XIONG D. Towards robust in-context learning for machine translation with large language models[C]//Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024: 16619-16629.
[46] ZHANG B, HADDOW B, BIRCH A, et al. Prompting large language model for machine translation[C]//Proceedings of the 40th International Conference on Machine Learning, 2023: 41092-41110.
[47] WEI J, BOSMA M, ZHAO V Y, et al. Finetuned language models are zero-shot learners[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2109.01652.
[48] RIOS M. Instruction-tuned large language models for machine translation in the medical domain[EB/OL]. [2024-09-25]. https://arxiv.org/abs/2408.16440.
[49] WEI J, WANG X Z, SCHUURMANS D, et al. Chain-of-thought prompting elicits reasoning in large language models[C]//Advances in Neural Information Processing Systems 35, 2022: 24824-24837.
[50] KOJIMA T, GU S S, REID M, et al. Large language models are zero-shot reasoners[C]//Advances in Neural Information Processing Systems 35, 2022: 22199-22213.
[51] HU T, ZHANG P, YANG B, et al. Large language model for multi-domain translation: benchmarking and domain CoT fine-tuning[EB/OL]. [2024-11-10]. https://arxiv.org/abs/2410.02631.