
Journal of Frontiers of Computer Science and Technology, 2025, Vol. 19, Issue (9): 2493-2505. DOI: 10.3778/j.issn.1673-9418.2409074
• Artificial Intelligence·Pattern Recognition •
CHANG Jian, ZHANG Hui, JIN Haibo, WANG Bingbing
Online: 2025-09-01
Published: 2025-09-01
CHANG Jian, ZHANG Hui, JIN Haibo, WANG Bingbing. Multistage Learning for SBERT Word-Level Adversarial Sample Detection[J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(9): 2493-2505.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2409074
| [1] | LIU Ying, FENG Xiaodong, HE Jinglu. Zero-Shot Image Classification Based on Feature Enhancement and Contrastive Embedding [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(8): 2123-2134. |
| [2] | SHI Jiliang, ZHANG Qian, YANG Sihong, LIU Shuang, TENG Lin, BAI Wuer. Image Inpainting Guided by Image Smoothness Structure [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(8): 2149-2160. |
| [3] | HU Zhongze, QIN Hongchao, LI Zhenjun, LI Yanhui, LI Ronghua, WANG Guoren. TCGCL: Complex Network Traffic Classification Algorithm Based on Graph Contrastive Learning [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(5): 1230-1240. |
| [4] | YUAN Jiang, MA Ji, ZHOU Dengwen. Boosting Degradation Representation Learning for Blind Image Super-Resolution [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(5): 1252-1263. |
| [5] | CAO Siyuan, CHEN Songcan. Frequency Domain mixup Augmentation and logit Compensation for Self-Supervised Multi-label Imbalanced Electrocardiogram Classification [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(4): 1011-1020. |
| [6] | WEI Chuyuan, YUAN Baojie, WANG Changdong. Multi-level User Interest and Multi-intent Fusion for Next Basket Recommendation [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(3): 749-763. |
| [7] | YUAN Lining, FENG Wengang, LIU Zhao. Node Classification Based on Kolmogorov-Arnold Networks [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(3): 645-656. |
| [8] | WANG Yonggui, YU Qi. Graph Isomorphism and Hybrid-Order Residual Gated Graph Neural Network for Session-Based Recommendation [J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(2): 502-512. |
| [9] | XU Zhihong, ZHANG Huibin, DONG Yongfeng, WANG Liqin, WANG Xu. Question Feature Enhanced Knowledge Tracing Model [J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2466-2475. |
| [10] | ZHU Weiwei, ZHANG Yijia, LIU Guantong, LU Mingyu, LIN Hongfei. Psychological Analysis of College Students' Anxiety Based on Domain Comparison Adaptive Model [J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1900-1910. |
| [11] | QIAO Zifeng, QIN Hongchao, HU Jingjing, LI Ronghua, WANG Guoren. Knowledge Graph Completion Algorithm with Multi-view Contrastive Learning [J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(4): 1001-1009. |
| [12] | WU Xiang, GAO Yujin, LI Ronghua, WANG Guoren. Temporal Link Prediction with Community-Level Information [J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(10): 2668-2677. |
| [13] | HAN Xu, WU Feng. Offline Meta-Reinforcement Learning with Contrastive Prediction [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(8): 1917-1927. |
| [14] | WANG Min, ZHAO Peng, GUO Xinping, MIN Fan. Fine-Grained Visual Categorization: Deep Pairwise Feature Comparison Interaction Algorithm [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(11): 2663-2675. |