[1] ZHANG Y J, ZHAO P, MA L, et al. An unbiased risk estimator for learning with augmented classes[C]//Advances in Neural Information Processing Systems 33, 2020: 10247-10258.
[2] XU M, GUO L Z. Learning from group supervision: the impact of supervision deficiency on multi-label learning[J]. Science China Information Sciences, 2021, 64(3): 130101.
[3] MCCLOSKEY M, COHEN N J. Catastrophic interference in connectionist networks: the sequential learning problem[M]//Psychology of learning and motivation. Elsevier, 1989: 109-165.
[4] WANG Z R, MEHTA S V, POCZOS B, et al. Efficient meta lifelong-learning with limited memory[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2020: 535-548.
[5] LOPEZ-PAZ D, RANZATO M A. Gradient episodic memory for continual learning[C]//Advances in Neural Information Processing Systems 30, Long Beach, Dec 4-9, 2017: 6467-6476.
[6] KIRKPATRICK J, PASCANU R, RABINOWITZ N, et al. Overcoming catastrophic forgetting in neural networks[J]. Proceedings of the National Academy of Sciences, 2017, 114(13): 3521-3526.
[7] YAN S, XIE J, HE X. DER: dynamically expandable representation for class incremental learning[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 3013-3022.
[8] WU Y, CHEN Y, WANG L, et al. Large scale incremental learning[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 374-382.
[9] YIN W P, LI J, XIONG C M. ConTinTin: continual learning from task instructions[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2022: 3062-3072.
[10] WANG J, DONG D, SHOU L, et al. Effective continual learning for text classification with lightweight snapshots[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(8): 10122-10130.
[11] CHEN D, SONG S, YU Q, et al. Grimoire is all you need for enhancing large language models[EB/OL]. [2024-07-22]. https://arxiv.org/abs/2401.03385.
[12] LIU P F, QIU X P, HUANG X J. Recurrent neural network for text classification with multi-task learning[C]//Proceedings of the 25th International Joint Conference on Artificial Intelligence. Menlo Park: AAAI, 2016: 2873-2879.
[13] CONNEAU A, SCHWENK H, BARRAULT L, et al. Very deep convolutional networks for text classification[C]//Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg: ACL, 2017: 1107-1116.
[14] YAO L, MAO C S, LUO Y. Graph convolutional networks for text classification[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 7370-7377.
[15] JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[C]//Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg: ACL, 2017: 427-431.
[16] SCHICK T, SCHÜTZE H. Exploiting cloze-questions for few-shot text classification and natural language inference[C]//Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg: ACL, 2021: 255-269.
[17] HAN X, ZHAO W, DING N, et al. PTR: prompt tuning with rules for text classification[J]. AI Open, 2022, 3: 182-192.
[18] SHI W J, MICHAEL J, GURURANGAN S, et al. Nearest neighbor zero-shot inference[C]//Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2022: 3254-3265.
[19] DE MASSON D'AUTUME C, RUDER S, KONG L, et al. Episodic memory in lifelong language learning[C]//Advances in Neural Information Processing Systems 32, Vancouver, Dec 8-14, 2019: 13122-13131.
[20] ZHOU D W, YE H J, ZHAN D C. Learning placeholders for open-set recognition[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 4399-4408.
[21] ZHAO B, XIAO X, GAN G, et al. Maintaining discrimination and fairness in class incremental learning[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 13205-13214.
[22] GPT-4 turbo preview: exploring the 128k context window[EB/OL]. [2024-07-22]. https://povio.com/blog/gpt-4-turbo-preview-exploring-the-128k-context-window/.
[23] PAN Z, WU Q, JIANG H, et al. LLMLingua-2: data distillation for efficient and faithful task-agnostic prompt compression[EB/OL]. [2024-07-22]. http://arxiv.org/abs/2403.12968.
[24] RUBIN O, HERZIG J, BERANT J. Learning to retrieve prompts for in-context learning[C]//Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2022: 2655-2671.
[25] ZHOU Y, MURESANU A I, HAN Z, et al. Large language models are human-level prompt engineers[EB/OL]. [2024-08-13]. http://arxiv.org/abs/2211.01910.
[26] KOJIMA T, GU S S, REID M, et al. Large language models are zero-shot reasoners[EB/OL]. [2024-08-13]. http://arxiv.org/abs/2205.11916.
[27] GLM Team, ZENG A, XU B, et al. ChatGLM: a family of large language models from GLM-130B to GLM-4 all tools[EB/OL]. [2024-07-22]. http://arxiv.org/abs/2406.12793.
[28] Papers with Code-THUCNews dataset[EB/OL]. [2024-07-22]. https://paperswithcode.com/dataset/thucnews.
[29] LI Z, HOIEM D. Learning without forgetting[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(12): 2935-2947.
[30] CHAUDHRY A, DOKANIA P K, AJANTHAN T, et al. Riemannian walk for incremental learning: understanding forgetting and intransigence[C]//Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 556-572.
[31] ALJUNDI R, BABILONI F, ELHOSEINY M, et al. Memory aware synapses: learning what (not) to forget[C]//Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 144-161.
[32] HOULSBY N, GIURGIU A, JASTRZEBSKI S, et al. Parameter-efficient transfer learning for NLP[C]//Proceedings of the 36th International Conference on Machine Learning, Long Beach, Jun 9-15, 2019: 2790-2799.
[33] PFEIFFER J, VULIĆ I, GUREVYCH I, et al. MAD-X: an adapter-based framework for multi-task cross-lingual transfer[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2020: 7654-7673.
[34] PFEIFFER J, KAMATH A, RÜCKLÉ A, et al. AdapterFusion: non-destructive task composition for transfer learning[C]//Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg: ACL, 2021: 487-503.