Journal of Frontiers of Computer Science and Technology ›› 2016, Vol. 10 ›› Issue (11): 1512-1523. DOI: 10.3778/j.issn.1673-9418.1509013


Data Object Managing Method in Stream Processing System

WANG Jiaxing1+, LIN Xuelian1, SHEN Yang1, ZHANG Yun2, ZHANG Mingming1, MA Shuai1   

  1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
  2. Shanghai General Recognition Technology Research Institute, Shanghai 201100, China
  • Online: 2016-11-01 Published: 2016-11-04



Abstract: In Internet of vehicles practice, several computing applications run on streaming systems to analyze GPS/OBD data collected from vehicles. These applications share common requirements and features: long-running execution, low processing delay, and the need to keep state in memory. After running for a long time, however, such a streaming job holds a large number of computation parameters, states, and other data objects in memory, and many of these objects are inactive. Leaving them in memory wastes system resources. To address this problem, this paper proposes a data object management method for streaming tasks that optimizes the memory use of the streaming system. It establishes a lifecycle model for streaming data objects (SDOs) and uses an application-driven, data-driven method to determine appropriate expiration parameters for SDOs. Finally, the method is evaluated on Internet of vehicles applications. Experiments show that it effectively reduces the number of inactive SDOs and improves resource utilization while keeping the processing delay within a range acceptable to users.

Key words: Internet of vehicles, streaming system, inactive object, lifecycle management, data-driven model
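
The abstract does not give the details of the lifecycle model, so the following is only a minimal illustrative sketch of the general idea: state objects that have not been accessed within an expiration window are treated as inactive and removed from memory. The class name `ExpiringStateStore`, the single `ttl_seconds` parameter, and the choice to simply drop evicted objects are all assumptions made for illustration, not the authors' method.

```python
import time
from typing import Any, Callable, Dict, Optional


class ExpiringStateStore:
    """In-memory keyed state that drops entries idle longer than a TTL.

    Hypothetical illustration only: the paper's actual lifecycle model and
    the way it derives expiration parameters are not described in the abstract.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds  # assumed single expiration parameter
        self._clock = clock             # injectable clock, useful for testing
        self._values: Dict[str, Any] = {}
        self._last_access: Dict[str, float] = {}

    def put(self, key: str, value: Any) -> None:
        """Store a state object and mark it as just accessed."""
        self._values[key] = value
        self._last_access[key] = self._clock()

    def get(self, key: str) -> Optional[Any]:
        """Return the state object, refreshing its last-access time."""
        if key not in self._values:
            return None
        self._last_access[key] = self._clock()
        return self._values[key]

    def evict_inactive(self) -> int:
        """Remove objects idle for more than ttl_seconds; return the count."""
        now = self._clock()
        stale = [k for k, t in self._last_access.items()
                 if now - t > self.ttl_seconds]
        for k in stale:
            del self._values[k]
            del self._last_access[k]
        return len(stale)

    def __len__(self) -> int:
        return len(self._values)
```

In a streaming job, `evict_inactive` might be called periodically, for example once per batch or from a timer. Whether an evicted object should be discarded or written to external storage for later recovery is a design decision this sketch leaves open.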
