Journal of Frontiers of Computer Science and Technology ›› 2018, Vol. 12 ›› Issue (5): 730-740. DOI: 10.3778/j.issn.1673-9418.1709035

• Database Technology •

Resume Activeness Prediction in Online Recruitment Scenarios

SHI Shuyang1, ZHANG Zhipeng1, GUO Long1, SHAO Yingxia1, CUI Bin2+   

  1. Key Lab of High Confidence Software Technologies, Ministry of Education, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
  2. Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China
  • Online: 2018-05-01 Published: 2018-05-07

Abstract: In the Internet era, a large volume of recruitment takes place on online recruitment platforms, which recommend jobs to applicants and resumes to recruiters. However, it is almost impossible for a platform to know whether an applicant has already found a job. As a result, resumes often continue to be recommended after the applicant has been hired, which wastes platform resources and degrades the user experience. This paper formalizes the resume activeness prediction problem in online recruitment scenarios: the goal is to identify applicants who will be highly active in the future, so that the platform recommends the active ones and discards the inactive ones, thereby improving the user experience. Existing solutions for user activeness prediction mostly target social scenarios and build their models around the characteristics of social networks; because recruitment scenarios differ in these characteristics, such solutions are not applicable. Based on an analysis of real-world recruitment data, this paper identifies four characteristics of online recruitment scenarios: high dynamism, low user stickiness, bidirectional matching, and the priority of recall over precision in the prediction result. Based on these characteristics, this paper proposes a model named RAP (resume activeness prediction), which accommodates the first three characteristics and provides a filtering parameter γ that can be tuned to satisfy the priority of recall. In experiments on real-world data from 58 Recruitment Website, RAP achieves an AUC of 0.817.

Key words: user activeness prediction, online recruitment, classification, data analysis
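
The two evaluation ideas named in the abstract, AUC and a recall-oriented parameter γ, can be illustrated with a minimal sketch. The abstract does not define how γ is applied, so the code below assumes it is a cutoff on a predicted activeness score; that reading, the function names, and the toy labels and scores are all hypothetical and are not taken from the paper or from the 58 data.

```python
from typing import List, Tuple


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def precision_recall(
    labels: List[int], scores: List[float], gamma: float
) -> Tuple[float, float]:
    """Precision and recall when resumes with score >= gamma are kept.

    ASSUMPTION: gamma is treated as a score cutoff; the paper may define
    its parameter differently.
    """
    kept = [s >= gamma for s in scores]
    tp = sum(1 for y, k in zip(labels, kept) if y == 1 and k)
    fp = sum(1 for y, k in zip(labels, kept) if y == 0 and k)
    fn = sum(1 for y, k in zip(labels, kept) if y == 1 and not k)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Toy data (made up): 1 = applicant stays active, 0 = applicant becomes inactive.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.1]

print(auc(labels, scores))
print(precision_recall(labels, scores, gamma=0.5))
print(precision_recall(labels, scores, gamma=0.35))
```

Under this assumed reading, lowering the cutoff keeps more resumes, so recall can only stay the same or rise; on the toy data the looser cutoff recovers every active applicant. AUC is computed from the ranking alone and does not depend on the cutoff.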