Relational-Tri-Training: Learning First-Order Rules Exploiting Unlabeled Data

Journal of Frontiers of Computer Science and Technology, 2012, Vol. 6, Issue (5): 430-442. DOI: 10.3778/j.issn.1673-9418.2012.05.005

LI Yanjuan1,2+, GUO Maozu1

1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
2. School of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China

Online: 2012-05-01 Published: 2012-05-09

Abstract

Abstract: Current inductive logic programming (ILP) systems require sufficient labeled training data and cannot exploit unlabeled data. To address this limitation, this paper proposes relational-tri-training (R-tri-training), an algorithm that learns first-order rules by exploiting unlabeled data. The algorithm brings the idea of tri-training, a semi-supervised learning algorithm based on propositional logic representation, into ILP systems based on first-order logic representation, and studies how unlabeled examples can assist classifier training within the ILP framework. R-tri-training first initializes three different ILP systems from the labeled data and the background knowledge, and then iteratively refines the three classifiers with unlabeled examples: if two classifiers agree on the label of an unlabeled example, then under certain conditions that example is labeled and given to the third classifier as a new training example. Experimental results on standard benchmark datasets show that R-tri-training effectively exploits unlabeled data to improve learning performance, and that it outperforms genetic inductive logic programming (GILP), NFOIL, KFOIL and ALEPH.

Key words: machine learning, inductive logic programming (ILP), relational-tri-training, probably approximately correct (PAC) learnability
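
The refinement loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three ILP systems are replaced by hypothetical nearest-neighbour learners (`make_nn_learner`), the "certain conditions" for accepting a pseudo-labeled example are not given in the abstract and are therefore omitted, and the final majority vote (`vote`) is an assumption borrowed from standard tri-training.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[float, ...]
Labeled = Tuple[Example, int]
Classifier = Callable[[Example], int]
Learner = Callable[[List[Labeled]], Classifier]


def make_nn_learner(key: Callable[[Example], float]) -> Learner:
    """Hypothetical stand-in for an ILP system: 1-nearest-neighbour on key(x)."""
    def learn(train: List[Labeled]) -> Classifier:
        data = list(train)  # copy so later changes to the caller's list have no effect

        def predict(x: Example) -> int:
            nearest = min(data, key=lambda ex: abs(key(ex[0]) - key(x)))
            return nearest[1]
        return predict
    return learn


def tri_train(learners: List[Learner], labeled: List[Labeled],
              unlabeled: List[Example], max_rounds: int = 10) -> List[Classifier]:
    """Refine three classifiers with unlabeled examples (simplified sketch)."""
    train_sets = [list(labeled) for _ in learners]
    classifiers = [learn(ts) for learn, ts in zip(learners, train_sets)]
    for _ in range(max_rounds):
        new_sets = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # Pseudo-label each unlabeled example on which the other two agree.
            # The paper's acceptance conditions are NOT modelled here.
            agreed = [(x, classifiers[j](x)) for x in unlabeled
                      if classifiers[j](x) == classifiers[k](x)]
            new_sets.append(list(labeled) + agreed)
        if new_sets == train_sets:  # nothing changed, so stop iterating
            break
        train_sets = new_sets
        classifiers = [learn(ts) for learn, ts in zip(learners, train_sets)]
    return classifiers


def vote(classifiers: List[Classifier], x: Example) -> int:
    """Majority vote over the classifiers (an assumption, see the lead-in)."""
    return Counter(c(x) for c in classifiers).most_common(1)[0][0]


# Toy data, invented purely for illustration.
labeled = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]
unlabeled = [(0.1, 0.2), (0.9, 0.8), (0.4, 0.3), (0.6, 0.7)]
learners = [
    make_nn_learner(lambda x: x[0]),
    make_nn_learner(lambda x: x[1]),
    make_nn_learner(lambda x: x[0] + x[1]),
]
classifiers = tri_train(learners, labeled, unlabeled)
print(vote(classifiers, (0.45, 0.45)), vote(classifiers, (0.8, 0.9)))
```

In this sketch the diversity among the three classifiers comes from using different learners, mirroring the "three different ILP systems" in the abstract; each round retrains a classifier on the labeled data plus the examples its two peers agree on.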

LI Yanjuan, GUO Maozu. Relational-Tri-Training: Learning First-Order Rules Exploiting Unlabeled Data[J]. Journal of Frontiers of Computer Science and Technology, 2012, 6(5): 430-442.

