Journal of Frontiers of Computer Science and Technology, 2010, Vol. 4, Issue (7): 589-598. DOI: 10.3778/j.issn.1673-9418.2010.07.002

• Academic Research •

Web Page Quality Estimation with User Behavior*

WANG Xiaoguang+; LIU Yiqun; JIN Yijiang; CEN Rongwei; MA Shaoping; RU Liyun

  1. State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2010-07-14 Published: 2010-07-14
  • Contact: WANG Xiaoguang


Abstract: Page quality estimation plays a crucial role in search engines, and the traditional approach is based on hyperlink structure analysis. Given the complexity of the current Web environment, methods that rely on links alone no longer work well, so user behavior has received growing attention in recent years as a way to compensate for their shortcomings. User behavior can be divided into two types: browsing behavior and searching behavior. By analyzing browsing behavior, a user browsing graph is constructed. A new approach to exploiting searching behavior is proposed, and a user searching graph is constructed with it. Combining the two graphs yields the user browsing-searching graph. Experimental results show that the performance of the user browsing-searching graph is close to that of the user browsing graph and exceeds that of the whole Web graph. In addition, the user browsing-searching graph can evaluate more pages than the user browsing graph.

Key words: page quality estimation, user behavior, user browsing graph, user searching graph, user browsing-searching graph
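
The pipeline outlined in the abstract (build one graph per behavior type, combine them, then score pages) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how edges are derived from search logs, how the two graphs are combined, or which ranking algorithm is used. Here, consecutive page visits and consecutive result clicks are assumed to form edges, merging is assumed to sum edge weights, and a weighted PageRank is assumed as the scoring step. All function names are hypothetical.

```python
from collections import defaultdict


def build_browsing_graph(sessions):
    """Assumed construction: one weighted edge per consecutive page visit."""
    edges = defaultdict(int)
    for pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            if src != dst:
                edges[(src, dst)] += 1
    return dict(edges)


def build_searching_graph(query_clicks):
    """Assumed construction: link results clicked consecutively for one query.

    The paper proposes its own approach; this stand-in only illustrates
    that search logs can also be turned into page-to-page edges.
    """
    edges = defaultdict(int)
    for clicked in query_clicks:
        for src, dst in zip(clicked, clicked[1:]):
            if src != dst:
                edges[(src, dst)] += 1
    return dict(edges)


def merge_graphs(*graphs):
    """Assumed combination rule: sum the weights of identical edges."""
    merged = defaultdict(int)
    for graph in graphs:
        for edge, weight in graph.items():
            merged[edge] += weight
    return dict(merged)


def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (assumed scoring step)."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    out_weight = defaultdict(float)
    for (src, _), weight in edges.items():
        out_weight[src] += weight
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Toy logs, invented for illustration only.
sessions = [["a", "b", "c"], ["a", "c"]]
query_clicks = [["c", "d"], ["d", "a"]]

browsing = build_browsing_graph(sessions)
searching = build_searching_graph(query_clicks)
combined = merge_graphs(browsing, searching)
scores = pagerank(combined)
```

In this toy data the combined graph contains page `d`, which never appears in the browsing sessions, so the merged graph can score a page that the browsing graph alone cannot.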

