Artificial Intelligence and Pattern Recognition

### Analysis of Variance of F1 Measure Based on Blocked 3×2 Cross Validation

YANG Liu1+, WANG Yu2

1. School of Applied Mathematics, Shanxi University of Finance & Economics, Taiyuan 030006, China
2. School of Software, Shanxi University, Taiyuan 030006, China
• Online: 2016-08-01 Published: 2016-08-09

Abstract: In statistical machine learning research, classification algorithms are often compared experimentally by their F1 measure estimated through cross validation. To draw statistically convincing conclusions, it is important to estimate the uncertainty of the F1 measure. Blocked 3×2 cross validation in particular has been shown, both theoretically and experimentally, to outperform other cross validation methods such as standard K-fold cross validation. This paper therefore studies theoretically the variance of the F1 measure based on blocked 3×2 cross validation. The structure of the variance shows that it is composed of three parts: block variance, within-block covariance, and between-blocks covariance. This also implies that the commonly used sample variance may grossly underestimate or overestimate the true variance. These theoretical results are validated by experiments on simulated and real data sets using bar charts. The experimental results show that the within-block and between-blocks covariances are of the same order as the block variance, so the within-block and between-blocks correlations cannot be neglected.
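
To make the setting concrete, the following is a minimal Python sketch of one common way to build a blocked 3×2 cross validation and of how correlated F1 values inflate the variance of their average. It is an illustration under stated assumptions, not the paper's derivation: the function names (`blocked_3x2_splits`, `f1_score`, `variance_of_mean`) are hypothetical, the data are assumed to be split into four equal blocks that are paired in the three possible ways, and the six F1 values are assumed to share one variance `sigma2`, one covariance `omega` within a 2-fold group, and one covariance `gamma` between groups. How these three quantities map onto the abstract's block variance, within-block covariance, and between-blocks covariance is our reading of the text.

```python
import numpy as np

def blocked_3x2_splits(n, seed=0):
    """Return 3 groups, each holding the 2 (train_idx, test_idx) pairs of a 2-fold CV.

    Assumption: the n samples are shuffled, cut into 4 blocks, and the blocks
    are paired in the 3 possible ways to form two halves.
    """
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(n), 4)
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    groups = []
    for first, second in pairings:
        half_a = np.concatenate([blocks[i] for i in first])
        half_b = np.concatenate([blocks[i] for i in second])
        # Each half serves once as training set and once as test set.
        groups.append([(half_a, half_b), (half_b, half_a)])
    return groups

def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN) for binary labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def variance_of_mean(sigma2, omega, gamma):
    """Variance of the average of the 6 F1 values under the assumed structure.

    Var(mean) = (6*sigma2 + 6*omega + 24*gamma) / 36
              = (sigma2 + omega + 4*gamma) / 6
    (6 variances, 6 ordered within-group pairs, 24 ordered between-group pairs).
    """
    return (sigma2 + omega + 4 * gamma) / 6
```

Under these assumptions, setting `omega = gamma = 0` recovers the familiar `sigma2 / 6`, which is what an estimate that treats the six values as independent effectively targets. Nonzero covariances shift the true value away from it, consistent with the abstract's warning that the sample variance may underestimate or overestimate the real variance.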