Journal of Frontiers of Computer Science and Technology ›› 2010, Vol. 4 ›› Issue (3): 240-246.DOI: 10.3778/j.issn.1673-9418.2010.03.006

• Academic Research •

TF-IDF Similarity Based Method for Tag Clustering

HAN Min, TANG Changjie+, DUAN Lei, LI Chuan, GONG Jie   

  1. School of Computer Science, Sichuan University, Chengdu 610065, China
  • Online: 2010-03-15 Published: 2010-03-15
  • Contact: TANG Changjie

Abstract: Social tagging systems, a concept introduced with Web 2.0, aim to express users’ interests clearly and specifically. Tag clustering is an important research topic in mining social tagging systems, and evaluating the similarity among social tags is its key technique. The main contributions are: (1) a new TF-IDF-based method for calculating tag similarity, together with a clustering algorithm built on that similarity; (2) an analysis of the conditions that influence tag similarity; (3) extensive experiments showing that the proposed method is more accurate than previously proposed methods.

Key words: tag clustering, similarity, social tagging system, TF-IDF
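
The abstract does not state the similarity formula itself, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each tag is treated as a "document" whose "terms" are the resources it annotates, weights each resource by a standard TF-IDF score, and compares tags by cosine similarity. The function names `tfidf_tag_vectors` and `cosine`, the `(tag, resource)` input format, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf_tag_vectors(annotations):
    """Build a TF-IDF vector per tag from (tag, resource) tagging events.

    Assumption: TF is the share of a tag's uses that fall on a resource,
    and IDF is log(number of tags / number of tags used on that resource).
    """
    counts = defaultdict(Counter)  # tag -> resource -> number of uses
    for tag, resource in annotations:
        counts[tag][resource] += 1

    n_tags = len(counts)
    tags_per_resource = Counter()  # resource -> number of distinct tags
    for resources in counts.values():
        tags_per_resource.update(resources.keys())

    vectors = {}
    for tag, resources in counts.items():
        total = sum(resources.values())
        vectors[tag] = {
            r: (c / total) * math.log(n_tags / tags_per_resource[r])
            for r, c in resources.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[r] for r, w in u.items() if r in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data (hypothetical): r1 carries every tag, so its IDF weight is zero.
annotations = [
    ("python", "r1"), ("python", "r2"),
    ("programming", "r1"), ("programming", "r2"),
    ("travel", "r1"), ("travel", "r3"),
]
vectors = tfidf_tag_vectors(annotations)
sim_related = cosine(vectors["python"], vectors["programming"])
sim_unrelated = cosine(vectors["python"], vectors["travel"])
```

In this toy data, "python" and "programming" share the discriminative resource r2 and come out as highly similar, while "python" and "travel" share only r1, which every tag uses and which therefore contributes nothing. A pairwise similarity of this kind could then be fed to any standard clustering procedure; which one the paper uses is not stated in the abstract.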
