Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (8): 1370-1379.DOI: 10.3778/j.issn.1673-9418.1806079

Modeling Personalized Constructivist Extraction of Domain News

REN Binbin, XIE Zhenping, LIU Yuan   

  1. School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Jiangsu Key Laboratory of Media Design and Software Technology (Jiangnan University), Wuxi, Jiangsu 214122, China
  • Online: 2019-08-01 Published: 2019-08-07

领域资讯的个性化建构抽取建模研究

任斌斌, 谢振平, 刘渊

  1. 江南大学 数字媒体学院,江苏 无锡 214122
  2. 江苏省媒体设计与软件技术重点实验室(江南大学),江苏 无锡 214122

Abstract: Reading online news has become the main way individuals acquire knowledge in the Internet era, and improving the efficiency of news acquisition is the core goal of personalized news services. To automatically gather domain news that meets personalized requirements, depth-first and breadth-first Web page extraction strategies are first examined, and a constructivist cognition extraction model with a balanced combined random walk strategy is then proposed. The model is grounded in the constructivist theory of human learning: it models how users progressively come to understand news information and simulates how they progressively read and extract online news. Experimental analysis on health-domain news shows that the proposed model better simulates the human process of selecting news to read, providing a basis for personalized news extraction services.

Key words: Web crawler, personalized reading, constructivist learning theory, domain news

摘要: 网络资讯阅读已成为互联网时代个人知识增长的主要手段,更有效地提升资讯获取效率是个性化资讯服务的核心目标。以自动地采集满足个性化需求的领域资讯为问题目标,考虑深度优先、广度优先的抽取策略,并提出平衡组合游走建构认知抽取模型对上述问题进行建模研究。该模型基于人类学习的建构主义理论,基于用户对资讯信息的逐渐认知过程进行建模表达,并模拟用户逐渐阅读抽取网络资讯的过程。在健康领域资讯上的实验分析表明,该模型可更好地模拟人类的资讯阅读选择过程,从而为个性化资讯抽取服务提供基础手段。

关键词: 网络爬虫, 个性化阅读, 建构主义学习理论, 领域资讯
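
The abstract contrasts depth-first and breadth-first page extraction and then proposes a strategy that balances the two through a random walk. The sketch below only illustrates the two baseline visiting orders and one simple way to mix them at random. It is not the model proposed in the paper: the toy link graph `LINKS`, the function `crawl`, and the `depth_bias` parameter are all assumptions made for illustration.

```python
import random
from collections import deque

# Hypothetical toy link graph (page -> outgoing links); not data from the paper.
LINKS = {
    "home": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [],
    "a2": [],
    "b1": [],
}


def crawl(start, depth_bias, seed=0):
    """Return the order in which pages reachable from `start` are visited.

    depth_bias = 1.0 always takes the newest frontier entry (depth-first
    order on this tree-shaped graph); depth_bias = 0.0 always takes the
    oldest entry (breadth-first). Values in between choose randomly between
    the two ends of the frontier. This mixing rule is an illustrative
    assumption, not the paper's balanced combined random walk.
    """
    rng = random.Random(seed)
    frontier = deque([start])
    seen = {start}
    order = []
    while frontier:
        if rng.random() < depth_bias:
            page = frontier.pop()      # newest entry: go deeper
        else:
            page = frontier.popleft()  # oldest entry: go wider
        order.append(page)
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order


if __name__ == "__main__":
    print("breadth-first:", crawl("home", depth_bias=0.0))
    print("depth-first:  ", crawl("home", depth_bias=1.0))
    print("mixed:        ", crawl("home", depth_bias=0.5))
```

With `depth_bias` at either extreme the visiting order is deterministic; intermediate values visit the same set of pages in an order that depends on the random seed.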