Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2020, Vol. 14 ›› Issue (2): 236-243. DOI: 10.3778/j.issn.1673-9418.1903067

• Academic Research •

动态模糊粗糙特征选取算法

倪鹏,刘阳明,赵素云,陈红,李翠平   

  1. 中国人民大学 数据工程与知识工程教育部重点实验室,北京 100872
  2. 中国人民大学 信息学院,北京 100872
  • 出版日期:2020-02-01 发布日期:2020-02-16

Dynamic Fuzzy Rough Feature Selection Algorithm

NI Peng, LIU Yangming, ZHAO Suyun, CHEN Hong, LI Cuiping   

  1. Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Renmin University of China, Beijing 100872, China
  2. School of Information, Renmin University of China, Beijing 100872, China
  • Online:2020-02-01 Published:2020-02-16

摘要:

由于数据随时间和空间不断更新,很多基于粗糙集的增量方法被提出。然而,动态数据上基于模糊粗糙集的特征选取(也称属性约简)更新的研究较少,特别是连续型动态数据上的增量特征选取。为了解决这个问题,提出适用于连续型数据的基于模糊粗糙集的增量属性约简算法。首先提出模糊粗糙基本概念的增量机制,如模糊正域的增量机制。只有部分示例在已有属性约简上的辨识能力不足,即对于模糊正域来说,存在一个关键示例集。增量约简算法基于已有数据上的约简结果,仅需要更新关键示例集中的示例,而非全部的论域。因而该增量算法在动态数据上能快速获得约简的更新。通过数值对比实验可以看出,增量算法比非增量算法在运行时间上有明显的优势。特别是对于高维数据集,增量算法可以大大地节省计算时间。

关键词: 特征选择, 增量学习, 模糊粗糙集, 依赖度

Abstract:

Because data are constantly updated over time and space, many incremental techniques based on rough sets have been proposed. However, little work addresses updating fuzzy rough set based feature selection (i.e., attribute reduction) on dynamic data, especially incremental feature selection on continuous dynamic data. To address this problem, an incremental attribute reduction algorithm based on fuzzy rough sets is proposed for continuous data. First, incremental mechanisms for basic fuzzy rough set concepts are proposed, such as the incremental mechanism of the fuzzy positive region. Only some instances are insufficiently discerned by the existing attribute reduct; that is, for the fuzzy positive region, there exists a key instance set. Starting from the reduct obtained on the existing data, the incremental reduction algorithm only needs to update the instances in the key instance set rather than the entire universe. Therefore, the incremental algorithm can quickly update the reduct on dynamic data. Numerical comparison experiments show that the incremental algorithm has a clear advantage in running time over non-incremental attribute reduction algorithms. The savings in computing time are especially large on high-dimensional datasets.

Key words: feature selection, incremental learning, fuzzy rough set, dependency function
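
The key-instance idea in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption rather than the authors' algorithm: the abstract gives no formulas, so the sketch uses one common fuzzy rough formulation (attributes scaled to [0, 1], similarity `1 - |a(x) - a(y)|` combined by `min`, crisp class labels), and it takes a key instance to be one whose fuzzy positive region membership under the current reduct is lower than under the full attribute set. The function names and the greedy repair step are hypothetical.

```python
def similarity(x, y, attrs):
    # Assumed fuzzy similarity: min over attributes of 1 - |x[a] - y[a]|.
    return min((1.0 - abs(x[a] - y[a]) for a in attrs), default=1.0)


def pos_membership(X, labels, attrs, i):
    # Membership of instance i in the fuzzy positive region: its dissimilarity
    # to the most similar instance that carries a different label.
    return min(
        (1.0 - similarity(X[i], X[j], attrs)
         for j in range(len(X)) if labels[j] != labels[i]),
        default=1.0,
    )


def dependency(X, labels, attrs):
    # Dependency degree: mean positive-region membership over the universe.
    return sum(pos_membership(X, labels, attrs, i) for i in range(len(X))) / len(X)


def key_instances(X, labels, reduct, all_attrs, tol=1e-9):
    # Assumed definition: instances that the reduct discerns worse than
    # the full attribute set does.
    return [
        i for i in range(len(X))
        if pos_membership(X, labels, reduct, i)
        < pos_membership(X, labels, all_attrs, i) - tol
    ]


def update_reduct(X, labels, reduct, all_attrs, tol=1e-9):
    # Hypothetical greedy repair: add attributes until no key instance is
    # left; only the key instances are re-evaluated in each round.
    reduct = list(reduct)
    keys = key_instances(X, labels, reduct, all_attrs, tol)
    while keys:
        candidates = [a for a in all_attrs if a not in reduct]
        best = max(
            candidates,
            key=lambda a: sum(pos_membership(X, labels, reduct + [a], i) for i in keys),
        )
        reduct.append(best)
        keys = [
            i for i in keys
            if pos_membership(X, labels, reduct, i)
            < pos_membership(X, labels, all_attrs, i) - tol
        ]
    return reduct


# Toy data: attribute 0 alone separates the two old instances.
old_X, old_y = [[0.1, 0.2], [0.9, 0.2]], [0, 1]
# One new instance arrives that attribute 0 cannot tell apart from instance 0.
new_X, new_y = old_X + [[0.1, 0.9]], old_y + [1]

old_keys = key_instances(old_X, old_y, [0], [0, 1])
new_keys = key_instances(new_X, new_y, [0], [0, 1])
new_reduct = update_reduct(new_X, new_y, [0], [0, 1])
```

In this toy run the old data have no key instances under the reduct `[0]`. After the arrival, instances 0 and 2 become key instances, and the repair step adds attribute 1. Instance 1 is never re-evaluated during the repair, which is the source of the time saving that the abstract reports.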