• Academic Research •


### Distributed Stochastic Variance Reduction Gradient Descent Algorithm topkSVRG

WANG Jianfei, KANG Liangyi, LIU Jie, YE Dan

1. Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
2. University of Chinese Academy of Sciences, Beijing 100190, China
• Online:2018-07-01 Published:2018-07-06

Abstract:

Machine learning problems are usually formulated as the optimization of an objective function, and optimization algorithms are the key tools for solving for the parameters of that function. Stochastic gradient descent (SGD) is currently one of the most widely used optimization methods, but it is sensitive to gradient noise and therefore achieves only a sub-linear convergence rate. The improved stochastic variance reduction gradient (SVRG) method achieves a linear convergence rate, but SVRG is a serial algorithm. To handle distributed training on large-scale data sets, this paper designs topkSVRG based on the SVRG algorithm. The key idea is that a master node maintains a global model while local nodes update their local models on local data. In each epoch, the global model is updated by selecting the k local models with the smallest distance to the current global model. In general, a larger k makes the model converge faster, while a smaller k makes the convergence rate easier to guarantee. Theoretical analysis shows that topkSVRG retains a linear convergence rate. The algorithm is implemented on Spark, and experiments demonstrate its efficiency compared with Mini-Batch SGD, CoCoA, Splash, and other methods.
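The scheme described in the abstract can be sketched in a few lines of NumPy. This is a hypothetical single-process simulation, not the paper's Spark implementation: the loss function (least squares), step size, inner-loop length, and Euclidean distance as the model-selection metric are all assumptions introduced for illustration. Each simulated worker runs one SVRG epoch on its data shard starting from the global model, and the master then averages the k local models closest to the current global model.

```python
import numpy as np

def svrg_epoch(w, X, y, lr=0.02, inner_steps=200, rng=None):
    """One SVRG epoch on a local shard for least squares f(w) = ||Xw - y||^2 / (2n).
    A full gradient is computed once at the snapshot w_tilde; each inner step then
    uses the variance-reduced gradient g_i(w) - g_i(w_tilde) + full_grad."""
    rng = rng or np.random.default_rng(0)
    n = X.shape[0]
    w_tilde = w.copy()
    full_grad = X.T @ (X @ w_tilde - y) / n  # anchor gradient for variance reduction
    for _ in range(inner_steps):
        i = rng.integers(n)  # sample one data point uniformly
        g_i = X[i] * (X[i] @ w - y[i])
        g_i_tilde = X[i] * (X[i] @ w_tilde - y[i])
        w = w - lr * (g_i - g_i_tilde + full_grad)
    return w

def topk_svrg(shards, dim, k, epochs=40, lr=0.02, seed=0):
    """Hypothetical sketch of topkSVRG aggregation: every worker runs a local SVRG
    epoch from the current global model; the master keeps only the k local models
    nearest (Euclidean distance) to the global model and averages them."""
    rng = np.random.default_rng(seed)
    w_global = np.zeros(dim)
    for _ in range(epochs):
        local_models = [svrg_epoch(w_global.copy(), X, y, lr, rng=rng)
                        for X, y in shards]
        dists = [np.linalg.norm(w - w_global) for w in local_models]
        nearest = np.argsort(dists)[:k]  # top-k closest local models
        w_global = np.mean([local_models[i] for i in nearest], axis=0)
    return w_global

# Usage on synthetic noiseless least-squares data split across 4 workers:
rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(300, 3))
y = X @ w_true
shards = [(X[i::4], y[i::4]) for i in range(4)]
w_est = topk_svrg(shards, dim=3, k=2)
```

Averaging only the k nearest local models filters out workers whose shards would pull the global model far away in a given epoch, which is the trade-off the abstract describes: a larger k averages more information per round, while a smaller k keeps each global update conservative.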