Journal of Frontiers of Computer Science and Technology ›› 2011, Vol. 5 ›› Issue (5): 433-445.

• Academic Research •

Genetic Evolution Based Parallelized SPA Churn Prediction Algorithm

DENG Xiaolong, WANG Bai, WU Bin, ZHAO Haizhou   

  1. Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Online: 2011-05-01 Published: 2011-05-01

遗传演化SPA流失预测算法及并行化 (Genetic-Evolution SPA Churn Prediction Algorithm and Its Parallelization)

邓小龙, 王 柏, 吴 斌, 赵海舟   


Abstract: Customer churn prediction based on data mining helps telecom operators improve customer service quality amid keen competition. To improve prediction accuracy and the adaptability of the prediction model, a new churn prediction algorithm based on genetic evolution is proposed that achieves a shorter running time and a better Lift value. The new algorithm, GASPA (genetic algorithm based SPA), uses genetic evolution from artificial intelligence to optimize the parameters of SPA (spreading activation), a model drawn from psychology. On a real mobile dataset, GASPA is shown to reach better accuracy than the fixed-step method and a better churn prediction Lift value than SPA. To handle very large datasets, a parallelized GASPA on MapReduce (M-GASPA) is proposed, which increases the data scale that can be processed. Experimental results on iteration time show that M-GASPA runs much faster than SPA while also improving the Lift value.
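To make the two central notions of the abstract concrete, the following is a minimal Python sketch of spreading activation on a weighted call graph and of the Lift measure. It is not the authors' implementation. The abstract does not say which SPA parameters GASPA tunes, so the `spreading_factor` and `steps` parameters, the proportional-to-edge-weight transfer rule, the function names, and the toy graph are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical representation: node -> {neighbor: call weight}.
# Every neighbor must itself appear as a key of the graph.
Graph = Dict[str, Dict[str, float]]


def spread_activation(graph: Graph, churners: Set[str],
                      spreading_factor: float = 0.5,
                      steps: int = 3) -> Dict[str, float]:
    """Diffuse 'energy' from known churners over a weighted call graph.

    Assumed rule: at each step a node keeps (1 - spreading_factor) of its
    energy and passes the rest to neighbors in proportion to edge weight.
    """
    energy = {node: (1.0 if node in churners else 0.0) for node in graph}
    for _ in range(steps):
        nxt = {node: 0.0 for node in graph}
        for node, e in energy.items():
            total_w = sum(graph[node].values())
            if e == 0.0 or total_w == 0.0:
                nxt[node] += e  # nothing to pass on, or no neighbors
                continue
            nxt[node] += (1.0 - spreading_factor) * e  # retained share
            for nbr, w in graph[node].items():
                nxt[nbr] += spreading_factor * e * w / total_w
        energy = nxt
    return energy


def lift_at(scores: Dict[str, float], actual: Set[str],
            fraction: float) -> float:
    """Lift = churn rate in the top-scored fraction / overall churn rate."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, round(fraction * len(ranked)))
    top_rate = sum(1 for n in ranked[:k] if n in actual) / k
    base_rate = len(actual) / len(ranked)
    return top_rate / base_rate


if __name__ == "__main__":
    # Toy symmetric call graph (illustrative data only).
    toy = {
        "a": {"b": 3.0, "c": 1.0},
        "b": {"a": 3.0, "d": 1.0},
        "c": {"a": 1.0},
        "d": {"b": 1.0},
    }
    print(spread_activation(toy, churners={"a"}, steps=1))
```

Under these assumptions, a parameter search such as a genetic algorithm could treat `spreading_factor` and `steps` as the genes and use `lift_at` on held-out churn labels as the fitness function.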

Key words: churn prediction, complex network, data mining, genetic evolution, cloud computing

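Abstract (translated from the Chinese): As competition in the telecom industry intensifies, customer churn prediction based on data mining is important for telecom operators seeking to improve customer service quality. To improve the efficiency of churn prediction and the generalization ability of the prediction model, the idea of genetic evolution from artificial intelligence is introduced to improve the churn prediction algorithm based on SPA (spreading activation), a diffusion model from psychology, yielding the genetic-evolution-based churn prediction algorithm GASPA (genetic algorithm based SPA). GASPA learns and optimizes the model parameters autonomously during evolution. Experiments on real phone call data and short message data show that GASPA outperforms the fixed-step method in accuracy and outperforms SPA in Lift value, significantly raising the Lift value of SPA and strengthening its churn prediction performance. To process massive telecom data, a parallelized scheme on a cloud computing platform, M-GASPA (MapReduce-GASPA), is implemented, which increases the data scale GASPA can process while reducing running time.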
