Journal of Frontiers of Computer Science and Technology ›› 2016, Vol. 10 ›› Issue (8): 1176-1183. DOI: 10.3778/j.issn.1673-9418.1603082


Analysis of Variance of F1 Measure Based on Blocked 3×2 Cross Validation

YANG Liu1+, WANG Yu2   

  1. School of Applied Mathematics, Shanxi University of Finance & Economics, Taiyuan 030006, China
  2. School of Software, Shanxi University, Taiyuan 030006, China
  • Online: 2016-08-01  Published: 2016-08-09

组块3×2交叉验证的F1度量的方差分析

杨  柳1+,王  钰2   

  1. 山西财经大学 应用数学学院,太原 030006
  2. 山西大学 软件学院,太原 030006

Abstract: In statistical machine learning research, quantitative experiments often compare the F1 measures of classification algorithms estimated by cross validation. To reach statistically convincing conclusions, it is important to estimate the uncertainty of the F1 measure. Blocked 3×2 cross validation in particular has been shown, both in theory and in experiments, to outperform other cross validation methods such as standard K-fold cross validation. This paper therefore studies theoretically the variance of the F1 measure based on blocked 3×2 cross validation. The structure of the variance shows that it is composed of three parts: the block variance, the within-block covariance and the between-blocks covariance. This implies that the commonly used sample variance may grossly underestimate or overestimate the true variance. These theoretical results are validated by experiments on simulated and real data sets, with the results presented as bar charts. The experiments show that the within-block and between-blocks covariances are of the same order as the block variance, so the within-block and between-blocks correlations cannot be neglected.

Key words: F1 measure, cross validation, variance, classification algorithm, simulated experiment
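
The structure described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: the blocked 3×2 scheme is taken to split the data into four equal blocks and to form three 2-fold replications by pairing the blocks in the three possible ways; "block" in the variance terms is read as one such replication; and all six F1 estimates are assumed to share one variance, one within-replication covariance and one between-replication covariance. The function names are hypothetical and the variance expression is the generic variance of a mean of six correlated estimates, not necessarily the exact formula derived in the paper.

```python
import random


def blocked_3x2_splits(n, seed=0):
    """Return six (train, test) index lists: 3 replications x 2 folds.

    Assumption: the data are cut into four equal blocks, and block 0 is
    paired with block 1, 2 or 3 to form the first half of each replication.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    blocks = [idx[i::4] for i in range(4)]
    splits = []
    for partner in (1, 2, 3):
        half_a = blocks[0] + blocks[partner]
        half_b = [i for b in range(1, 4) if b != partner for i in blocks[b]]
        splits.append((half_a, half_b))  # train on A, test on B
        splits.append((half_b, half_a))  # roles swapped
    return splits


def f1(y_true, y_pred, positive=1):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when the denominator is 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def variance_of_mean(block_var, within_cov, between_cov):
    """Variance of the mean of the six F1 estimates (homogeneity assumed).

    Of the 36 ordered pairs of estimates, 6 are variance terms, 6 lie in
    the same replication and 24 lie in different replications.
    """
    return (6 * block_var + 6 * within_cov + 24 * between_cov) / 36


def naive_variance_of_mean(block_var):
    """Variance of the mean if the six estimates were uncorrelated."""
    return block_var / 6
```

Under these assumptions, a block variance of 1.0 with both covariances equal to 0.5 gives variance_of_mean(1.0, 0.5, 0.5) = 21/36, about 0.583, whereas the uncorrelated formula gives 1/6, about 0.167. This is the kind of gap the abstract refers to when it says that covariances of the same order as the block variance cannot be neglected.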
