Journal of Frontiers of Computer Science and Technology, 2015, Vol. 9, Issue (8): 906-913. DOI: 10.3778/j.issn.1673-9418.1411041

• Database Technology •

Real-Time Big Data Processing Model Based on Receiving and Processing Separation

PENG Jianhua1, LI Chenming1, QIU Junlin1, LI Xiaofang2, XU Lizhong1+   

  1. College of Computer and Information, Hohai University, Nanjing 210098, China
  2. Changzhou Institute of Technology, Changzhou, Jiangsu 213002, China
  • Online: 2015-08-01 Published: 2015-08-06

Abstract: Big data processing demands very high data processing efficiency. To meet the need for real-time, efficient, and stable processing of big data, this paper proposes a data processing model that separates receiving from processing. The model consists of a data receiving unit, an in-memory database, an original data distribution unit, a data processing unit, a processed data distribution unit, and a data merging unit. The receiving unit receives and integrates structured and unstructured data and puts each complete record into the in-memory database. The original data distribution unit detects and fetches data from the in-memory database and dispatches it to the data processing units according to a massive-data load balancing algorithm. The data processing units process the data and put the results back into the in-memory database. The processed data distribution unit then fetches the processed data from the in-memory database and dispatches it to the data merging units, again according to the massive-data load balancing algorithm. Experimental results show that a system using this model maintains high processing efficiency.
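
The data flow described in the abstract can be illustrated with a minimal, single-process Python sketch. It is not the authors' implementation: the abstract does not specify the load balancing algorithm, so a simple least-loaded rule is assumed here, and every name (`MemoryStore`, `receive`, `distribute`, `process`, `run_pipeline`) is hypothetical. Two in-process queues stand in for the in-memory database.

```python
from collections import deque


class MemoryStore:
    """Stand-in for the in-memory database: two FIFO queues (assumption)."""

    def __init__(self):
        self.raw = deque()        # complete records awaiting processing
        self.processed = deque()  # results awaiting merging


def receive(store, structured, unstructured):
    """Receiving unit: integrate both parts into one complete record."""
    store.raw.append({"fields": structured, "payload": unstructured})


def least_loaded(loads):
    """Assumed balancing rule: pick the unit with the smallest load so far."""
    return min(range(len(loads)), key=lambda i: loads[i])


def distribute(queue, n_units, cost):
    """Distribution unit: drain a queue and assign each item to a unit."""
    loads = [0] * n_units
    batches = [[] for _ in range(n_units)]
    while queue:
        item = queue.popleft()
        unit = least_loaded(loads)
        batches[unit].append(item)
        loads[unit] += cost(item)
    return batches


def process(record):
    """Processing unit: a placeholder computation (payload length)."""
    return {"id": record["fields"]["id"], "size": len(record["payload"])}


def run_pipeline(inputs, n_processors=2, n_mergers=1):
    store = MemoryStore()
    # 1. Receiving is decoupled from processing: records only enter the store.
    for structured, unstructured in inputs:
        receive(store, structured, unstructured)
    # 2. The original data distribution unit feeds the processing units.
    for batch in distribute(store.raw, n_processors, lambda r: len(r["payload"])):
        for record in batch:
            store.processed.append(process(record))
    # 3. The processed data distribution unit feeds the merging units.
    merged = []
    for batch in distribute(store.processed, n_mergers, lambda r: 1):
        merged.append(sorted(batch, key=lambda r: r["id"]))
    return merged


inputs = [({"id": 1}, "aaaa"), ({"id": 2}, "bb"), ({"id": 3}, "c")]
result = run_pipeline(inputs)
```

In this sketch the receiving step never calls the processing step directly; the two sides communicate only through the store, which is the separation the model is named for. A real deployment would run the units concurrently on separate nodes, which this sketch does not attempt.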

Key words: big data, load balancing, real-time analysis and processing