Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (3): 591-597. DOI: 10.3778/j.issn.1673-9418.2009090

• Artificial Intelligence •

Knowledge Representation Learning Based on Multi-source Information Combination

XIA Guangbing, LI Ruixuan+, GU Xiwu, LIU Wei

  1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
  • Received: 2020-09-29 Revised: 2021-05-10 Online: 2022-03-01 Published: 2021-05-26
  • Corresponding author: + E-mail: rxli@hust.edu.cn
  • About the authors: XIA Guangbing, born in 1995, native of Huanggang, Hubei, M.S. His research interest is knowledge representation learning.
    LI Ruixuan, born in 1974, native of Yichang, Hubei, Ph.D., professor, Ph.D. supervisor. His research interests include big data processing and analysis, cloud and edge computing, data mining and machine learning.
    GU Xiwu, born in 1967, native of Wuhan, Hubei, Ph.D., associate professor, M.S. supervisor. His research interests include big data processing and analysis, data mining and machine learning.
    LIU Wei, born in 1997, native of Tianmen, Hubei, Ph.D. candidate. His research interests include natural language processing and machine learning.
  • Supported by:
    National Key Research and Development Program of China (2016QY01W0202); National Natural Science Foundation of China (U1836204); National Natural Science Foundation of China (U1936108)

Abstract:

In knowledge graphs, the text descriptions of entities, the hierarchical types of entities, and the topological structure of the graph carry rich information that can effectively supplement the original triples and improve performance on a range of knowledge graph tasks. To make full use of this multi-source heterogeneous information, entity descriptions are first encoded with a one-dimensional convolutional neural network. A projection matrix is then constructed from the hierarchical type information to map the entity vectors and entity description vectors of each triple into a relation-specific space, which constrains their semantics. Next, a graph attention mechanism fuses the topological structure of the graph and computes the influence of different neighboring nodes on each entity. Within the graph attention layer, multi-hop relations between entities are also computed to alleviate data sparsity. Finally, a two-dimensional convolutional neural network is used as the decoder to capture global information across different dimensions, further improving the model's performance. Link prediction experiments show that the multi-source information combined knowledge representation learning (MCKRL) model makes good use of multi-source heterogeneous information beyond triples, and therefore obtains better results than the baseline models on the link prediction task.

Key words: knowledge representation learning, entity description, hierarchical type, topological structure
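
As a rough illustration of the graph attention step mentioned in the abstract, the sketch below aggregates the embeddings of an entity's neighbors with softmax-normalized weights and treats two-hop neighbors as additional neighbors. It is not the MCKRL implementation: the dot-product scoring function, the helper names (`attention_aggregate`, `two_hop_neighbors`), and the toy embeddings are all assumptions made for the example, and a trained model would use learned attention parameters and relation embeddings instead.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def two_hop_neighbors(entity, adj):
    """Entities reachable in exactly two hops (hypothetical helper)."""
    one_hop = set(adj.get(entity, []))
    two_hop = set()
    for n in one_hop:
        two_hop.update(adj.get(n, []))
    return sorted(two_hop - one_hop - {entity})


def attention_aggregate(entity, neighbors, emb):
    """Attention-weighted sum of neighbor embeddings.

    Assumption: the attention score is a plain dot product, standing in
    for a learned scoring function.
    """
    scores = [dot(emb[entity], emb[n]) for n in neighbors]
    weights = softmax(scores)
    dim = len(emb[entity])
    agg = [sum(w * emb[n][i] for w, n in zip(weights, neighbors))
           for i in range(dim)]
    return agg, weights


# Toy graph: a -> b, a -> c, b -> d, so d is a two-hop neighbor of a.
emb = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0], "d": [0.5, 0.5]}
adj = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}

neighbors = adj["a"] + two_hop_neighbors("a", adj)
agg, weights = attention_aggregate("a", neighbors, emb)
```

Adding the two-hop neighbor `d` gives entity `a` one more source of information than its direct edges alone, which is the intuition behind using multi-hop relations against data sparsity.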

CLC number: