Journal of Frontiers of Computer Science and Technology, 2014, Vol. 8, Issue (9): 1129-1136. DOI: 10.3778/j.issn.1673-9418.1406018


Web Object Attribute Labeling Based on Constrained Conditional Random Fields

WU Qin+, HUANG Yanjiao   

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2014-09-01 Published: 2014-09-03

基于约束条件随机场的Web对象属性标注

吴  秦+,黄彦姣   

  1. 江南大学 物联网工程学院,江苏 无锡 214122

Abstract: The conditional random fields (CRF) model is one of the best statistical models for labeling the attributes of Web objects. However, the standard CRF model does not take full advantage of the feature relationships between Web objects and attribute labels. To address this problem, this paper proposes a boosted constrained conditional random fields model. Motivated by the maximum margin criterion, the proposed model adds constraints and a boosting factor to the CRF model to improve the accuracy of attribute labeling. Maximum likelihood estimation is used to estimate the weights of the feature functions, and the Viterbi algorithm is applied for prediction. A validation set is introduced into the dataset to select the best boosting factor. Experimental results show that the proposed model effectively improves the accuracy of Web object attribute labeling.

Key words: constrained conditional random fields, boosting factor, attribute labeling, Web object, maximum margin

摘要: 条件随机场模型是目前处理Web对象属性标注问题的最佳统计模型。为解决条件随机场模型不能充分利用Web对象和属性标签之间的特征关系这一问题,提出了一种增强约束条件随机场模型。借鉴最大间隔的思想,在原有条件随机场模型中增加约束条件和增强因子以提高模型标注正确率。使用最大似然参数估计方法估计模型特征函数的权重参数,并用Viterbi算法进行预测。在数据集中引入验证集的概念,以获得最优增强因子。实验结果表明,该模型有效地提高了Web对象属性标注正确率。

关键词: 约束条件随机场, 增强因子, 属性标注, Web对象, 最大间隔
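
The abstract names the Viterbi algorithm as the prediction step. As a rough illustration only, the sketch below shows standard Viterbi decoding for a plain linear-chain model in Python. It is not the paper's boosted constrained model: the abstract does not specify how the constraints or the boosting factor enter the scores, so they are not modeled here. The function name `viterbi`, the score tables `emission` and `transition`, and the two example labels are all assumptions made for this example.

```python
from typing import List, Sequence


def viterbi(emission: Sequence[Sequence[float]],
            transition: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence for a linear-chain model.

    emission[t][y]   : score of label y at position t (hypothetical input)
    transition[p][y] : score of moving from label p to label y (hypothetical)
    Scores are treated as unnormalized log-potentials, so they are summed.
    """
    n_steps = len(emission)
    if n_steps == 0:
        return []
    n_labels = len(emission[0])

    # best[t][y]: best score of any path that ends in label y at position t
    best = [list(emission[0])]
    # back[t - 1][y]: best previous label when position t takes label y
    back = []

    for t in range(1, n_steps):
        scores, pointers = [], []
        for y in range(n_labels):
            prev = max(range(n_labels),
                       key=lambda p: best[-1][p] + transition[p][y])
            scores.append(best[-1][prev] + transition[prev][y] + emission[t][y])
            pointers.append(prev)
        best.append(scores)
        back.append(pointers)

    # Start from the best final label and follow the back-pointers.
    path = [max(range(n_labels), key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical labels: 0 = "name", 1 = "price"
emission = [[2.0, 0.0], [0.5, 1.0], [0.0, 2.0]]
transition = [[0.5, 0.0], [-1.0, 0.5]]
print(viterbi(emission, transition))  # prints [0, 1, 1]
```

In a trained CRF, the emission and transition scores would come from the weighted feature functions whose weights are estimated by maximum likelihood; here they are hand-picked numbers chosen only to make the example easy to check by hand.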