Journal of Frontiers of Computer Science and Technology

• Academic Research •

Research progress of RFID data cleaning technology

WANG Jian, LE Jiajin

  1. School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou 450046, China
  2. School of Computer Science and Technology, Donghua University, Shanghai 201620, China

Abstract: Radio frequency identification (RFID) is an automatic identification technology that relies on radio transponders, called RFID tags, to store and retrieve data quickly. With the advent of the Internet of Things, RFID has become widely used in daily life, for example in retail, supply chain management, and item tracking. Because RFID tags communicate with readers without direct contact, large volumes of data can be collected in a short time. The collected data, however, suffer from problems such as false negative readings (missed reads), false positive readings (extra reads), duplicated readings, and out-of-order readings. Since RFID data are also generated rapidly, at large scale, and with strict timeliness requirements, cleaning large-scale RFID data efficiently and in a short time has become an important research topic in the database field. Researchers have proposed a large number of RFID data cleaning techniques, which facilitate the preprocessing and application of RFID data. This paper surveys the existing RFID data cleaning techniques. First, it introduces RFID systems and the problem definitions of RFID data cleaning. Second, it analyzes the related research challenges. Third, it summarizes the relevant datasets and evaluation metrics. Fourth, it compares, categorizes, and summarizes the existing techniques in detail from five aspects: false negative reading cleaning, false positive reading cleaning, duplicated reading cleaning, out-of-order reading processing, and RFID system applications. Finally, it discusses possible future research directions for RFID data cleaning, as a reference for related research.

Keywords: RFID, data cleaning, false negative reading, false positive reading, duplicated reading, out-of-order reading, system application