计算机科学与探索 ›› 2018, Vol. 12 ›› Issue (4): 671-680.DOI: 10.3778/j.issn.1673-9418.1703010

• 理论与算法 • 上一篇    

快速多维标度算法研究

屈太国1,2+,蔡自兴3   

  1. 1. 衡阳师范学院 计算机科学与技术学院,湖南 衡阳 421002
    2. 智能信息处理与应用湖南省重点实验室,湖南 衡阳 421002
    3. 中南大学 信息科学与工程学院,长沙 410083
  • 出版日期:2018-04-01 发布日期:2018-04-04

Research on Fast Algorithms of Classical Multidimensional Scaling

QU Taiguo1,2+, CAI Zixing3   

  1. 1. College of Computer Science and Technology, Hengyang Normal University, Hengyang, Hunan 421002, China
    2. Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, Hunan 421002, China
    3. School of Information Science and Engineering, Central South University, Changsha 410083, China
  • Online:2018-04-01 Published:2018-04-04

摘要: 经典多维标度法(classical multidimensional scaling,CMDS)是一种常用的数据降维和可视化方法。随着数据规模的扩大,CMDS的运算时间急剧增加。为了提高CMDS的计算速度,研究了3种适用于不同距离矩阵的快速算法。通过预先确定枢轴,减少了不必要的距离计算,提出了一种基于FastMap的快速算法。基于分而治之策略,提出了一种新的算法dcMDS(divide-and-conquer based MDS)。通过合理地选择标志点集,确保LMDS(landmark multidimensional scaling)能得到与CMDS一致的解。当样本内在维数远小于样本个数时,这些算法都能得到与CMDS完全一致的解,并且在速度上有大幅提高。实验证实了这3种算法与CMDS的一致性以及高效性。

关键词: 经典多维标度法, iLMDS, iFastMap, dcMDS

Abstract: Classical multidimensional scaling (CMDS) is a commonly used method for dimensionality reduction and data visualization. With the rapid growth in data size, the running time of CMDS increases dramatically. To speed up CMDS, this paper studies three fast algorithms, which are suitable for applications with different distance matrices. By selecting the pivots in advance, the unnecessary calculation of distance is avoided, therefore, a fast algorithm based on FastMap is put forward. Based on divide-and-conquer strategy, a new fast algorithm called dcMDS (divide-and-conquer based MDS) is proposed. By properly choosing landmark points, the LMDS (landmark multidimensional scaling) is ensured to get the same solution as that of CMDS. When the intrinsic dimension of the sample is far less than the sample size, these algorithms can obtain the same solutions as that of CMDS with high speed. The experimental results verify the consistency with CMDS and the efficiency of these fast algorithms.

Key words: classical multidimensional scaling, iLMDS, iFastMap, dcMDS