Journal of Frontiers of Computer Science and Technology, 2025, Vol. 19, Issue (1): 245-252. DOI: 10.3778/j.issn.1673-9418.2310100

• Artificial Intelligence·Pattern Recognition •

Multimodal Unsupervised Entity Alignment Approach with Progressive Strategies

MA He, WANG Hairong, WANG Yiyan, SUN Chong, ZHOU Beijing   

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
  2. The Key Laboratory of Images & Graphics Intelligent Processing of State Ethnic Affairs Commission, Yinchuan 750021, China
  • Online: 2025-01-01 Published: 2024-12-31

渐进式策略的多模态无监督实体对齐方法

马赫,王海荣,王艺焱,孙崇,周北京   

  1. 北方民族大学 计算机科学与工程学院,银川 750021
  2. 北方民族大学 图像图形智能处理国家民委重点实验室,银川 750021

Abstract: Although current entity alignment methods achieve good results by exploiting the structural information between entities in knowledge graphs, they ignore the large amount of side information that entities carry. This information is distinctive to each entity and can be used to enhance alignment. This paper analyzes the usability of entity side information in entity alignment and proposes an unsupervised entity alignment method that uses a progressive strategy and fuses image and text information. The method fuses the literal information and the visual information of entities to enhance their feature representation; applies a bidirectional threshold nearest neighbor algorithm to filter out entity pairs whose distance is too large; introduces a progressive strategy that dynamically increases the similarity threshold to control both the quality and the speed of aligned-pair generation; and defines an assignment algorithm to optimize the results obtained by the progressive strategy. To validate the proposed method, experiments are conducted on three sub-datasets of DBP15K (ZH_EN, JA_EN, and FR_EN), and the results are compared with those of 10 methods, including PSR, EVA, and DATTI. On the ZH_EN and JA_EN sub-datasets, Hits@1 reaches 95.7% and 97.4%, respectively, and on FR_EN, Hits@10 reaches 99.9%, indicating that the proposed method performs well.
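The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the concatenation-based fusion, the cosine similarity, the threshold schedule (`start`, `step`, `rounds`), and the brute-force assignment step are all assumptions chosen for a toy example, and the similarity matrix is made up. The sketch shows mutual (bidirectional) nearest neighbor matching under a threshold, a threshold that increases from round to round, an assignment step for the entities left over, and the usual definition of Hits@k.

```python
from itertools import permutations
from math import sqrt


def fuse(text_vec, image_vec):
    # Assumed fusion: plain concatenation of literal and visual features.
    return list(text_vec) + list(image_vec)


def cosine(u, v):
    # Cosine similarity between two feature vectors (0.0 for a zero vector).
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mutual_pairs(sim, rows, cols, threshold):
    # Keep (i, j) only if i and j are each other's nearest neighbor
    # and their similarity reaches the threshold.
    pairs = []
    for i in rows:
        j = max(cols, key=lambda c: sim[i][c])
        back = max(rows, key=lambda r: sim[r][j])
        if back == i and sim[i][j] >= threshold:
            pairs.append((i, j))
    return pairs


def progressive_align(sim, start=0.5, step=0.1, rounds=3):
    # Hypothetical schedule: the threshold grows by `step` each round.
    rows = list(range(len(sim)))
    cols = list(range(len(sim[0])))
    aligned, threshold = [], start
    for _ in range(rounds):
        if not rows or not cols:
            break
        found = mutual_pairs(sim, rows, cols, threshold)
        aligned.extend(found)
        for i, j in found:
            rows.remove(i)
            cols.remove(j)
        threshold += step
    return aligned, rows, cols


def assign_rest(sim, rows, cols):
    # Toy assignment step: brute-force the one-to-one matching with the
    # highest total similarity. Assumes len(rows) <= len(cols).
    best, best_score = [], float("-inf")
    for perm in permutations(cols, len(rows)):
        score = sum(sim[i][j] for i, j in zip(rows, perm))
        if score > best_score:
            best, best_score = list(zip(rows, perm)), score
    return best


def hits_at_k(sim, gold, k):
    # Fraction of source entities whose gold target is among the k most
    # similar candidates.
    hits = 0
    for i, target in gold.items():
        ranked = sorted(range(len(sim[i])), key=lambda c: sim[i][c], reverse=True)
        hits += target in ranked[:k]
    return hits / len(gold)


# Made-up similarity matrix between three entities of each knowledge graph.
sim = [
    [0.90, 0.20, 0.10],
    [0.30, 0.60, 0.50],
    [0.20, 0.55, 0.40],
]
aligned, rest_rows, rest_cols = progressive_align(sim)
final_pairs = aligned + assign_rest(sim, rest_rows, rest_cols)
```

In this toy run the first round aligns the two confident mutual pairs, and the assignment step resolves the remaining entity. The brute-force search is only workable for a handful of entities; at realistic scale a dedicated solver for the linear assignment problem would be needed.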

Key words: entity alignment, unsupervised, multimodal, progressive strategy, assignment problem

摘要: 当前的实体对齐方法,虽然利用知识图谱中实体间的结构信息取得了不错的对齐效果,但是忽略了实体间包含的大量侧面信息。这些信息具有唯一性特征,可以用于增强对齐效果。分析了实体侧面信息在实体对齐中的可用性,提出了一种无监督实体对齐方法,使用渐进式策略并融合图文信息。该方法通过融合实体的字面量信息和视觉信息,来增强实体的特征表示;采用双向阈值最近邻算法,过滤掉距离度量过高的实体对;引入渐进式策略,来动态增加相似度阈值,以控制对齐实体对的生成质量和生成速度;定义分配算法,以优化渐进式策略得到的结果。为了验证提出的方法,在DBP15K数据集的ZH_EN、JA_EN、FR_EN子数据集上进行实验,并与PSR、EVA、DATTI等10种方法的结果进行了对比分析。实验结果表明,该方法在ZH_EN和JA_EN子数据集的对齐任务上,Hits@1指标分别达到了95.7%和97.4%,在FR_EN上Hits@10指标达到了99.9%,性能表现较佳。

关键词: 实体对齐, 无监督, 多模态, 渐进式策略, 分配问题