Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (2): 279-300. DOI: 10.3778/j.issn.1673-9418.2304081
CHANG Yu (常钰), WANG Gang (王钢), ZHU Peng (朱鹏), KONG Lingfei (孔令飞), HE Jingheng (何京恒)
Online: 2024-02-01
Published: 2024-02-01
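Abstract: An industrial Internet security knowledge graph can play an important role in enriching the semantic relations among security concepts, improving the quality of security knowledge bases, and strengthening visual analysis of the security situation. It has become key to understanding, tracing, and defending against attacks on new-energy industrial control systems. However, compared with the construction of general-domain knowledge graphs, every stage of constructing an industrial Internet security knowledge graph still has many open problems, which limit its effectiveness in practice. This paper introduces the concept and significance of the industrial Internet security knowledge graph and how it differs from general knowledge graphs; summarizes related work on ontology construction for such graphs and the role of that work; and focuses on related work on three key stages of knowledge graph construction in the industrial Internet security setting, namely named entity recognition, relation extraction, and coreference resolution. For each stage, it reports in detail the development history and current state of research within the domain, analyzes in depth the domain-specific challenges the stage faces, such as discontinuous entity recognition, candidate word extraction, and the lack of high-quality domain datasets, and outlines future research directions that address those challenges. The aim is to provide reference and insight for further improving the quality and practicality of industrial Internet security knowledge graphs, and thereby for responding more effectively to emerging threats and attacks.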
CHANG Yu, WANG Gang, ZHU Peng, KONG Lingfei, HE Jingheng. Survey of Research on Construction Method of Industry Internet Security Knowledge Graph[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(2): 279-300.