Journal of Frontiers of Computer Science and Technology

• Science Researches •

Heterogeneous Information Network Embedding Learning based on Attention: A Survey

TU Jiaqi, ZHANG Hua, CHANG Xiaojie, WANG Ji, YUAN Shuhong   

  1. Information Technology Center, Zhejiang University, Hangzhou  310058, China



Abstract: In recent years, graph embedding learning has become one of the most commonly used techniques in information network analysis. It embeds network objects into a low-dimensional dense vector space while preserving the original network structure and content features. The learned embeddings are then applied to downstream tasks such as object classification, clustering and link prediction, where they have achieved remarkable results in academic research. However, many real-world networks are heterogeneous information networks (HINs), which are composed of multiple types of objects, multiple types of relationships between objects, and rich content features. Because of the heterogeneity of content, structure and semantics in HINs, traditional graph embedding learning cannot be applied to them directly. To learn more effective embeddings, researchers have begun to study how to integrate attention mechanisms into HIN embedding learning, so as to distinguish the influence of different content, structural and semantic information on the final embedding. This paper therefore reviews the existing attention-integrated HIN embedding models. Firstly, it reviews the research progress of HIN embedding over the past five years and summarizes a general framework for attention-integrated models. Secondly, existing work is classified according to the mode of attention integration, and representative models are described in detail. Then the commonly used datasets, benchmark platforms and evaluation metrics are introduced. Finally, future research directions of HIN embedding learning are summarized and discussed.

Key words: heterogeneous information network, graph embedding learning, attention mechanism, meta-path, graph neural network
