Journal of Frontiers of Computer Science and Technology (计算机科学与探索)

• Academic Research •

Multimodal Unsupervised Entity Alignment Approach with Progressive Strategies

MA He, WANG Hairong, WANG Yiyan, SUN Chong, ZHOU Beijing

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
  2. The Key Laboratory of Images & Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China
  • Online: 2023-12-12 Published: 2023-12-12

Abstract: Current entity alignment methods achieve good results by exploiting the structural information between entities in knowledge graphs, but they ignore the large amount of side information that entities carry. This information has unique characteristics and can be used to enhance alignment. This study analyzes the usability of entity side information for entity alignment and proposes an unsupervised entity alignment method that uses a progressive strategy and fuses image and text information. The method fuses the literal information and visual information of entities to enhance their feature representations; applies a bidirectional threshold nearest-neighbor algorithm to filter out entity pairs whose distance is too high; introduces a progressive strategy that dynamically increases the similarity threshold to control the quality and speed at which aligned entity pairs are generated; and defines an assignment algorithm to optimize the results obtained by the progressive strategy. To validate the method, experiments were carried out on the ZH_EN, JA_EN, and FR_EN subsets of the DBP15K dataset, and the results were compared with those of 10 methods, including PSR, EVA, and DATTI. The proposed method reaches Hits@1 of 95.7% and 97.4% on the ZH_EN and JA_EN alignment tasks, respectively, and Hits@10 of 99.0% on FR_EN, showing strong performance.

Key words: entity alignment, unsupervised, multimodal, progressive strategy, assignment problem
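
The matching steps named in the abstract (bidirectional threshold nearest-neighbor filtering, a progressively changing threshold, a final assignment step, and the Hits@k metric) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: all function names, the toy distance matrix `DIST`, and the threshold schedule are hypothetical. The sketch assumes the fused text and image features have already been reduced to a pairwise distance matrix, models the progressive strategy as a distance cutoff that is loosened each round, and uses a brute-force search as a stand-in for whatever assignment algorithm the paper defines.

```python
from itertools import permutations


def mutual_nearest_pairs(dist, threshold, used_src, used_tgt):
    """Pairs (i, j) that are each other's nearest unmatched neighbour
    and whose distance is at most `threshold`."""
    src = [i for i in range(len(dist)) if i not in used_src]
    tgt = [j for j in range(len(dist[0])) if j not in used_tgt]
    pairs = []
    if not src or not tgt:
        return pairs
    for i in src:
        j = min(tgt, key=lambda c: dist[i][c])      # nearest target of source i
        back = min(src, key=lambda r: dist[r][j])   # nearest source of target j
        if back == i and dist[i][j] <= threshold:
            pairs.append((i, j))
    return pairs


def progressive_align(dist, thresholds=(0.2, 0.4, 0.6)):
    """Accept the most confident pairs first, then relax the cutoff.
    The schedule is an assumed example, not taken from the paper."""
    aligned = {}
    for threshold in thresholds:
        new_pairs = mutual_nearest_pairs(
            dist, threshold, set(aligned), set(aligned.values()))
        aligned.update(new_pairs)
    return aligned


def assign_remaining(dist, aligned):
    """Brute-force minimum-cost one-to-one assignment of leftover entities.
    A stand-in for a real assignment solver; only viable for tiny inputs."""
    src = [i for i in range(len(dist)) if i not in aligned]
    used = set(aligned.values())
    tgt = [j for j in range(len(dist[0])) if j not in used]
    if not src or len(src) > len(tgt):
        return dict(aligned)
    best = min(permutations(tgt, len(src)),
               key=lambda p: sum(dist[i][j] for i, j in zip(src, p)))
    return {**aligned, **dict(zip(src, best))}


def hits_at_k(dist, gold, k):
    """Fraction of source entities whose gold target is among the k nearest."""
    hits = 0
    for i, j_true in gold.items():
        ranked = sorted(range(len(dist[i])), key=lambda c: dist[i][c])
        hits += j_true in ranked[:k]
    return hits / len(gold)


# Made-up distances between 4 source entities (rows) and 4 target entities (columns).
DIST = [
    [0.10, 0.80, 0.90, 0.70],
    [0.90, 0.30, 0.80, 0.60],
    [0.80, 0.70, 0.50, 0.90],
    [0.70, 0.90, 0.85, 0.95],
]

partial = progressive_align(DIST)          # confident pairs from the progressive rounds
final = assign_remaining(DIST, partial)    # leftovers resolved by assignment
```

In this toy example the fourth source entity never passes the mutual nearest-neighbor test, so it is only matched in the assignment step. On realistic data sizes, a polynomial-time solver such as the Hungarian algorithm would replace the brute-force search.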