Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (2): 279-300. DOI: 10.3778/j.issn.1673-9418.2304081

• Frontiers·Surveys •

Survey of Research on Construction Method of Industry Internet Security Knowledge Graph

CHANG Yu (常钰), WANG Gang (王钢), ZHU Peng (朱鹏), KONG Lingfei (孔令飞), HE Jingheng (何京恒)

  1. School of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010000, China
  • Online: 2024-02-01 Published: 2024-02-01

Abstract: Industrial Internet security knowledge graphs play an important role in enriching the semantic relationships among security concepts, improving the quality of security knowledge bases, and strengthening visual analysis of the security situation. They have become key to understanding, tracing, and defending against attacks on new-energy industrial control systems. However, compared with general-domain knowledge graph construction, every stage of constructing an industrial Internet security knowledge graph still has many problems, which limit its practical effectiveness. This paper introduces the concept and significance of the industrial Internet security knowledge graph and how it differs from general knowledge graphs. It then summarizes related work on ontology construction for such graphs and the role the ontology plays. Its main focus is the three key stages of knowledge graph construction in the industrial Internet security setting: named entity recognition, relation extraction, and coreference resolution. For each stage, it reports in detail on the development history and current state of research in this domain, analyzes domain-specific challenges such as discontinuous entity recognition, candidate word extraction, and the lack of high-quality domain datasets, and outlines future research directions that address these challenges. The aim is to offer reference and insight for further improving the quality and practicality of industrial Internet security knowledge graphs, and thereby for responding more effectively to emerging threats and attacks.

Key words: industrial Internet security, knowledge graph, named entity recognition, relation extraction, coreference resolution