Journal of Frontiers of Computer Science and Technology ›› 2013, Vol. 7 ›› Issue (5): 460-471. DOI: 10.3778/j.issn.1673-9418.1207010

• Academic Research •

Time-Series Multi-Level Probabilistic Graphical Model for Representing Lineages over Uncertain Data

ZHU Yunlei1, YUE Kun1+, QIAN Wenhua1, YANG Wenjing2, LIU Weiyi1

  1. Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650091, China
  2. Department of Computer Science and Communication Engineering, Dianchi College, Yunnan University, Kunming 650228, China
  • Online: 2013-05-01 Published: 2013-05-03

Abstract: Lineage analysis over uncertain data must trace the origin of uncertainty as data are produced and evolve over time. To reflect the inherent time-series property of lineage and the process of data evolution, and to support probabilistic inference and uncertainty tracing in lineage analysis, this paper considers lineage representation for query processing over uncertain data and adopts the Bayesian network (BN), an important probabilistic graphical model (PGM), as the framework for representing uncertain knowledge. It extends the BN to capture the time-series and multi-level properties of lineage. Starting from the Boolean formulas that express lineage, the paper proposes the concept of a time-series multi-level probabilistic graphical model spanning consecutive time slices. It gives methods for constructing the BN structure within each time slice and between adjacent time slices, and for computing the probability parameters of each node, with the aim of providing a model basis for lineage analysis. Experimental results show that the proposed lineage representation is effective and practical.

Key words: uncertain data, lineage representation, probabilistic graphical model (PGM), Bayesian network (BN), time-series multi-level probabilistic graphical model
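
The abstract's starting point, a Boolean lineage formula treated as a Bayesian network, can be illustrated with a minimal Python sketch. It is not the paper's time-series multi-level model: it covers a single time slice, assumes the base tuples are independent root nodes, and gives the derived result a deterministic conditional probability table defined by the formula. The tuple names `t1`, `t2`, `t3`, their prior probabilities, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import product


def joint_weight(assignment, priors):
    """Probability of one truth assignment, assuming independent base tuples."""
    weight = 1.0
    for name, value in assignment.items():
        p = priors[name]
        weight *= p if value else 1.0 - p
    return weight


def assignments(priors):
    """Enumerate every truth assignment over the base tuples."""
    names = sorted(priors)
    for values in product([False, True], repeat=len(names)):
        yield dict(zip(names, values))


def lineage_probability(formula, priors):
    """P(result = true): sum the weights of assignments that satisfy the formula."""
    return sum(joint_weight(a, priors) for a in assignments(priors) if formula(a))


def posterior_given_result(formula, priors, target):
    """P(target = true | result = true), a simple form of uncertainty tracing."""
    evidence = 0.0
    joint = 0.0
    for a in assignments(priors):
        if formula(a):
            w = joint_weight(a, priors)
            evidence += w
            if a[target]:
                joint += w
    if evidence == 0.0:
        raise ValueError("the result tuple has probability zero")
    return joint / evidence


# Hypothetical base tuples with their existence probabilities (assumed values).
priors = {"t1": 0.9, "t2": 0.5, "t3": 0.2}


def lineage(a):
    # Hypothetical lineage of one result tuple: (t1 AND t2) OR t3.
    return (a["t1"] and a["t2"]) or a["t3"]


p_result = lineage_probability(lineage, priors)             # approx. 0.56
p_t3_given = posterior_given_result(lineage, priors, "t3")  # approx. 0.357
print(round(p_result, 4), round(p_t3_given, 4))
```

Exhaustive enumeration is exponential in the number of base tuples, so this sketch only suits small formulas; it is meant to show what "probability parameters of nodes" and "uncertainty tracing" mean in the simplest setting, not how the paper computes them.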