Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (9): 2293-2325. DOI: 10.3778/j.issn.1673-9418.2402023
Survey of AIGC Large Model Evaluation: Enabling Technologies, Vulnerabilities and Mitigation

XU Zhiwei, LI Hailong, LI Bo, LI Tao, WANG Jiatai, XIE Xueshuo, DONG Zehui

Online: 2024-09-01
Published: 2024-09-01
Abstract: Artificial intelligence generated content (AIGC) models have attracted widespread attention and adoption worldwide owing to their outstanding content-generation capabilities. However, the rapid development of large AIGC models also brings a series of hidden risks, such as problems with the interpretability, fairness, and security and privacy of model outputs. To reduce unknown risks and their harms, comprehensive evaluation of large AIGC models is becoming increasingly important. Academia has begun research on evaluating large AIGC models, aiming to effectively address the related challenges and avoid potential risks. This paper reviews, surveys, and analyzes research on AIGC large model evaluation. It gives an overview of the evaluation process, covering pre-evaluation preparation and the corresponding evaluation metrics, and systematically organizes existing evaluation benchmarks. Representative applications of large AIGC models in finance, politics, and healthcare, together with their existing problems, are discussed. Evaluation methods are examined in depth from perspectives including interpretability, fairness, robustness, safety, and privacy; the new issues that AIGC large model evaluation needs to address are deconstructed, and strategies for coping with the new challenges of large model evaluation are proposed. Finally, the future challenges facing AIGC large model evaluation are discussed and its development directions are envisioned.
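Among the evaluation angles the abstract lists, robustness concerns how stable a model's outputs remain under small input perturbations such as adversarial or typo-level prompt noise. Purely as an illustration of that idea, and not the authors' method, the following minimal Python sketch scores output consistency between clean and perturbed prompts; `query_model` is a hypothetical stand-in for any text-generation API, and the character-swap perturbation is an assumed, simplistic noise model.

```python
# Minimal sketch of a prompt-robustness check for a generative model.
# Hypothetical: `query_model` stands in for any text-generation API; the
# character-swap perturbation is an illustrative noise model, not the
# methodology of the surveyed benchmarks.
import random
import difflib

def perturb(prompt: str, rate: float = 0.05, seed: int = 0) -> str:
    """Randomly swap adjacent characters to simulate typo-level noise."""
    rng = random.Random(seed)
    chars = list(prompt)
    for i in range(len(chars) - 1):
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def consistency(a: str, b: str) -> float:
    """Similarity of two outputs in [0, 1]; 1.0 means identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def robustness_score(query_model, prompts, n_variants: int = 5) -> float:
    """Mean output consistency between clean and perturbed prompts."""
    scores = []
    for p in prompts:
        clean = query_model(p)
        for k in range(n_variants):
            noisy = query_model(perturb(p, seed=k))
            scores.append(consistency(clean, noisy))
    return sum(scores) / len(scores)

if __name__ == "__main__":
    # Trivial stand-in model for demonstration only.
    prompts = ["Summarize the risks of generative AI in finance."]
    print(robustness_score(lambda p: p.upper(), prompts))
```

Real benchmarks typically substitute curated adversarial prompts for the character-swap noise and task-specific metrics for the string-similarity score; the loop structure, querying a model with clean and perturbed inputs and aggregating a consistency score, stays the same.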
XU Zhiwei, LI Hailong, LI Bo, LI Tao, WANG Jiatai, XIE Xueshuo, DONG Zehui. Survey of AIGC Large Model Evaluation: Enabling Technologies, Vulnerabilities and Mitigation[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2293-2325.