Incremental Construction of Time-Series Knowledge Graph

doi:10.3778/j.issn.1673-9418.2009068

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (3): 598-607. DOI: 10.3778/j.issn.1673-9418.2009068

ZHANG Zichen, YUE Kun⁺, QI Zhiwei, DUAN Liang

School of Information Science & Engineering, Yunnan University, Kunming 650500, China

Received: 2020-09-24 Revised: 2020-11-20 Published: 2022-03-01 Released online: 2020-12-08
Corresponding author: ⁺ E-mail: kyue@ynu.edu.cn
About the authors: ZHANG Zichen, born in 1996 in Kunming, Yunnan, M.S. candidate. His research interests include data and knowledge engineering.
YUE Kun, born in 1979 in Qujing, Yunnan, Ph.D., professor, Ph.D. supervisor, senior member of CCF. His research interests include massive data analysis and services, and data and knowledge engineering.
QI Zhiwei, born in 1987 in Shenchi, Shanxi, Ph.D. candidate, lecturer. His research interests include massive data analysis and uncertainty in artificial intelligence.
DUAN Liang, born in 1986 in Lincang, Yunnan, Ph.D. His research interests include social network analysis and unsupervised learning.
Supported by:
Yunnan Joint Funds of National Natural Science Foundation of China (U1802271); National Natural Science Foundation of China (62002311); Science Foundation for Distinguished Young Scholars of Yunnan Province (2019FJ011); Fund for Distinguished Young Scholars of Yunnan Province (C6193032); Cultivation Project of Donglu Scholar of Yunnan University; China Postdoctoral Science Foundation (2020M673310)

Abstract

A knowledge graph (KG) with time-series features is called a time-series KG; it describes the incrementally growing concepts in a knowledge base and the relations among them. Because knowledge changes over time, adding new knowledge to the time-series KG promptly and accurately lets the KG reflect how knowledge evolves. To this end, this paper defines the time-series KG and proposes a TransH-based method for its incremental construction. To add new and relevant triples to the current KG accurately, the paper proposes a model for calculating the coincidence between a triple and the current KG, together with a greedy algorithm for extracting the optimal subset of triples to be added. The optimal triple set is then added to the current KG, which completes the incremental update of the time-series KG. Experimental results show that the proposed method extracts the optimal triples quickly and adds them to the KG effectively, which verifies its efficiency and effectiveness.

Key words: time-series knowledge graph, coincidence, incremental construction, greedy algorithm
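
The abstract names TransH as the embedding model but does not spell out the coincidence model or the greedy criterion. The sketch below is therefore only an illustration under assumptions: it uses the standard TransH distance (project head and tail onto the relation hyperplane, then translate) as a stand-in for the coincidence score, and a hypothetical `greedy_select` that repeatedly takes the best-scoring remaining candidate until a threshold or budget is reached. The names `transh_distance`, `greedy_select`, `threshold`, and `budget` are illustrative and do not come from the paper.

```python
import math


def project(v, w):
    """Project v onto the hyperplane whose unit normal is w (TransH projection)."""
    dot = sum(a * b for a, b in zip(v, w))
    return [a - dot * b for a, b in zip(v, w)]


def transh_distance(h, t, w_r, d_r):
    """Standard TransH distance ||h_perp + d_r - t_perp||_2; lower is more plausible."""
    h_p, t_p = project(h, w_r), project(t, w_r)
    return math.sqrt(sum((a + d - b) ** 2 for a, d, b in zip(h_p, d_r, t_p)))


def greedy_select(candidates, entities, relations, threshold, budget=None):
    """Assumed greedy rule: take candidates in order of increasing distance,
    stopping at the first one above `threshold` or once `budget` is reached."""
    scored = sorted(
        ((transh_distance(entities[h], entities[t], *relations[r]), (h, r, t))
         for h, r, t in candidates),
        key=lambda pair: pair[0],
    )
    selected = []
    for dist, triple in scored:
        if dist > threshold or (budget is not None and len(selected) >= budget):
            break
        selected.append(triple)
    return selected


# Toy embeddings (made up): relation "r" has normal w_r = (0, 1) and translation d_r = (1, 0).
entities = {"a": [0.0, 3.0], "b": [1.0, -2.0], "c": [4.0, 0.0]}
relations = {"r": ([0.0, 1.0], [1.0, 0.0])}
candidates = [("a", "r", "c"), ("a", "r", "b")]
print(greedy_select(candidates, entities, relations, threshold=1.0))
```

In this toy setup the triple `("a", "r", "b")` has distance 0 and `("a", "r", "c")` has distance 3, so only the first passes a threshold of 1.0. A real coincidence model would presumably also account for how a triple relates to the existing graph, which this sketch does not attempt.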

CLC Number: TP391

ZHANG Zichen, YUE Kun, QI Zhiwei, DUAN Liang. Incremental Construction of Time-Series Knowledge Graph[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 598-607.

Figures/Tables 12

Fig.1 Example of adding a new triple to the KG

Fig.2 Vector space of entities and relations

Table 1 Datasets

Dataset	Entities	Relations	Triples
Wikidata	2 785 340	963	7 816 262
CN-DBpedia	23 385 784	377 911	65 001 293
Freebase	114 277 828	13 690	605 473 035
FB50K	50 013	195	654 235
FB500K	502 394	468	4 180 072

Fig.3 Execution time of incremental construction of $G_i$ with different datasets

Fig.4 Execution time of $G_{i+1}$ construction under different numbers of iterations

Fig.5 Extraction results under different thresholds

Fig.6 Extraction results of different methods

Fig.7 Extraction results under different proportions

Table 2 Extraction results under different datasets

Dataset	Number of triples/$10^{5}$	$P$	$R$	$F1$
Wikidata	1	0.796	0.756	0.775
	4	0.819	0.789	0.804
	7	0.810	0.782	0.795
	10	0.807	0.799	0.803
CN-DBpedia	10	0.745	0.645	0.692
	40	0.764	0.662	0.710
	70	0.751	0.690	0.719
	100	0.735	0.610	0.667
Freebase	100	0.684	0.555	0.613
	400	0.687	0.557	0.616
	700	0.685	0.604	0.642
	1 000	0.651	0.573	0.610
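
As a quick consistency check on Table 2, and assuming $F1$ is the usual harmonic mean of $P$ and $R$ (this page does not define it), the reported values can be recomputed from the $P$ and $R$ columns. The helper name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (dataset, triples / 1e5, P, R, reported F1), copied from Table 2
rows = [
    ("Wikidata", 1, 0.796, 0.756, 0.775),
    ("CN-DBpedia", 100, 0.735, 0.610, 0.667),
    ("Freebase", 1000, 0.651, 0.573, 0.610),
]
for name, size, p, r, reported in rows:
    print(name, size, round(f1_score(p, r), 3), reported)
```

For these rows the recomputed values agree with the reported column to within about 0.001, which is consistent with $P$ and $R$ having been rounded to three decimals.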

Table 3 Results of incremental construction under different datasets

Dataset	New triples	New entities	New relations
Wikidata	807 693	219 312	46
CN-DBpedia	7 352 195	2 490 193	728
Freebase	65 113 019	7 148 953	193

Table 4 Open-world entity prediction results

Methods	FB50K				FB500K
	Head		Tail		Head		Tail
	MR	HITS@50	MR	HITS@50	MR	HITS@50	MR	HITS@50
TransE (OW)	4 529	0.19	3 584	0.21	23 107	0.07	20 925	0.07
TransH (OW)	3 962	0.18	3 423	0.21	18 419	0.11	19 308	0.09
TransD (OW)	3 710	0.21	3 081	0.22	18 014	0.13	18 724	0.13
Incremental construction	2 011	0.31	1 819	0.32	7 731	0.14	9 215	0.11

Table 5 Closed-world entity prediction results

Methods	FB50K				FB500K
	Head		Tail		Head		Tail
	MR	HITS@10	MR	HITS@10	MR	HITS@10	MR	HITS@10
TransE	2 921	0.25	2 584	0.27	10 192	0.16	11 549	0.12
TransH	2 714	0.28	2 423	0.16	9 607	0.16	10 831	0.09
TransD	2 639	0.28	1 732	0.29	9 148	0.14	9 532	0.13
Incremental construction	1 593	0.31	2 051	0.22	8 712	0.18	9 215	0.09
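
Tables 4 and 5 report MR and HITS@k without defining them on this page. Assuming the conventional link-prediction definitions (MR is the mean rank of the correct entity among all candidates, and HITS@k is the fraction of test triples whose correct entity ranks in the top k), a minimal sketch looks like this. The function names and the rank list are made-up illustrations, not results from the paper.

```python
def mean_rank(ranks):
    """Average 1-based rank of the correct entity; lower is better."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test cases whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy ranks of the correct head (or tail) entity for four test triples.
ranks = [1, 3, 12, 60]
print(mean_rank(ranks), hits_at_k(ranks, 10), hits_at_k(ranks, 50))
```

Under these definitions a lower MR and a higher HITS@k both indicate better entity prediction, which is how the rows of the two tables would be compared.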
