Rule-Based Classifier for Probabilistic Data

Journal of Frontiers of Computer Science and Technology, 2013, Vol. 7, Issue 7: 639-648. DOI: 10.3778/j.issn.1673-9418.1305006

ZHAO Tingting1,2, ZHAO Suyun1+, PEI Bin1,2,3, CHEN Hong1,2, LI Cuiping1,2

1. Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Renmin University of China, Beijing 100872, China
2. School of Information, Renmin University of China, Beijing 100872, China
3. Computer Science Research Section, Army Officer Academy of PLA, Hefei 230031, China

Online: 2013-07-01; Published: 2013-07-02

Abstract

Classification is an important data mining problem that has been widely studied and applied, but previous work has focused mainly on classification of certain (deterministic) data. Because probabilistic data are now common in many fields, for example in data collected by sensors, it is necessary to study classification on such data. First, this paper proposes a new probabilistic data model that captures both the randomness of the probability distribution and the similarity on independent intervals. Second, it defines a discernible distance to measure the distance between tuples of such probabilistic data. Finally, it proposes a basic rule-based classification algorithm for probabilistic data and, building on it, a variable-precision variant that reduces sensitivity to noise or perturbation and improves classification accuracy. Experimental results verify the effectiveness of the proposed algorithm.

Key words: classification, randomness, probabilistic data, discernible distance

ZHAO Tingting, ZHAO Suyun, PEI Bin, CHEN Hong, LI Cuiping. Rule-Based Classifier for Probabilistic Data[J]. Journal of Frontiers of Computer Science and Technology, 2013, 7(7): 639-648.
