Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2010, Vol. 4, Issue (3): 240-246. DOI: 10.3778/j.issn.1673-9418.2010.03.006

• Academic Research •

TF-IDF Similarity Based Method for Tag Clustering

HAN Min, TANG Changjie+, DUAN Lei, LI Chuan, GONG Jie   

  1. School of Computer Science, Sichuan University, Chengdu 610065, China
  • Online: 2010-03-15 Published: 2010-03-15
  • Contact: TANG Changjie

Abstract: Social tagging systems, a concept introduced with Web 2.0, aim to express users' interests and intentions more clearly. Tag clustering is an important research topic in mining social tagging data, and computing the similarity between tags is its key technique. The main contributions are: (1) a new method for calculating tag similarity based on TF-IDF, together with a clustering algorithm built on that similarity; (2) an analysis of the conditions that influence tag similarity; (3) experiments showing that the proposed method is more accurate than existing methods.

Key words: tag clustering, similarity, social tagging system, TF-IDF
