Journal of Frontiers of Computer Science and Technology ›› 2014, Vol. 8 ›› Issue (4): 456-466. DOI: 10.3778/j.issn.1673-9418.1307020

Hidden Variable Discovering Algorithm Based on Local Causality Analysis

YAO Hongliang+, WU Lihui, WANG Hao, LI Junzhao

  1. School of Computer and Information, Hefei University of Technology, Hefei 230009, China
  • Online: 2014-04-01 Published: 2014-04-03

Abstract: Hidden variable discovery methods based on structural analysis have difficulty finding hidden variables effectively and offer poor interpretability. Drawing on causality and the uncertainty of local structure, this paper presents a hidden variable discovery algorithm based on local causality analysis (LCAHD). LCAHD introduces the definition of causal structure entropy, which integrates causal knowledge with uncertainty knowledge, takes the degree of uncertainty of the causal relationships as the criterion for the existence of hidden variables, and justifies this criterion theoretically. First, LCAHD finds the Markov blanket of the target variable to extract the local dependency structure, obtains interventional data through intervention learning, and combines the interventional and observational data to learn the causal relationships in the local dependency structure. Second, it uses causal structure entropy to measure the uncertainty of the causal relationships in the local causal structure, and applies a criterion relating hidden variables to causal uncertainty to determine whether hidden variables exist. Experiments on a standard network and a stock network show that the algorithm can accurately determine the location of hidden variables and has good interpretability.

Key words: hidden variables, Markov blanket, intervention learning, causality analysis, causal structure entropy
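
The first step of LCAHD, extracting the Markov blanket of the target variable, can be illustrated with a minimal sketch. It assumes the graph is already known and given as a mapping from each node to its set of parents. The paper itself learns the blanket from data, which this sketch does not attempt. The function name `markov_blanket` and the toy network `toy_dag` are hypothetical and used only for illustration. In a directed acyclic graph, the Markov blanket of a node consists of its parents, its children, and the other parents of its children.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a DAG.

    `parents` maps every node to the set of its parents (an assumed
    input format, not taken from the paper).
    """
    # Direct parents of the target.
    pa = set(parents.get(target, set()))
    # Children: nodes that list the target among their parents.
    children = {node for node, ps in parents.items() if target in ps}
    # Spouses: the other parents of each child.
    spouses: Set[str] = set()
    for child in children:
        spouses |= parents[child]
    blanket = pa | children | spouses
    blanket.discard(target)  # the target is not part of its own blanket
    return blanket


# Hypothetical toy network: A -> B -> C <- D, C -> E
toy_dag = {
    "A": set(),
    "B": {"A"},
    "C": {"B", "D"},
    "D": set(),
    "E": {"C"},
}

print(sorted(markov_blanket(toy_dag, "B")))
```

For the target `B` in this toy network, the blanket contains its parent `A`, its child `C`, and `D`, the other parent of `C`. The local dependency structure described in the abstract would be the subgraph over the target and these variables.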