Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (7): 1561-1569. DOI: 10.3778/j.issn.1673-9418.2101024

• Service Computing •

Multidimensional Information-Based Web Service Representation Method

ZHANG Xiangping1,2, LIU Jianxun1,2,+, XIAO Qiaoxiang2, CAO Buqing1,2

  1. Hunan Key Lab for Services Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
  2. School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
  • Received: 2021-01-07 Revised: 2021-03-08 Online: 2022-07-01 Published: 2021-03-25
  • About the authors:
    ZHANG Xiangping, born in 1993 in Sanming, Fujian, Ph.D. candidate. His research interests include service computing and service recommendation.
    LIU Jianxun, born in 1970 in Hengyang, Hunan, Ph.D., professor. His research interests include service computing and cloud computing.
    XIAO Qiaoxiang, born in 1993 in Hengyang, Hunan, M.S. Her research interest is service clustering.
    CAO Buqing, born in 1979 in Xiangtan, Hunan, Ph.D., professor. His research interests include service-oriented software engineering, service computing and cloud computing.
  • Supported by:
    the National Key Research and Development Program of China (2020YFB1707602)

Abstract:

With the development of service-oriented architecture (SOA) technology, the number of Web services is growing rapidly. Clustering or classifying Web services correctly and efficiently is an effective way to improve the quality of Web service discovery and the efficiency of Web service composition. However, existing Web service modeling methods, such as the latent Dirichlet allocation (LDA) topic model, struggle to obtain accurate and effective information from sparse Web service data for Web service clustering. To address this problem, this paper proposes a multi-dimensional information-based Web service representation method (MISR). First, it combines a Gaussian mixture model with the Word2Vec algorithm to generate word vectors that contain both the functional topic information and the semantic information of Web services. Then, it extracts the tag-word information, popularity, and co-occurrence information of Web services and combines them with the vectors from the previous step to generate Web service representation vectors that contain multi-dimensional information. Finally, the effectiveness of MISR is verified on two tasks, Web service clustering and Web service classification. In Web API service clustering experiments on a real-world dataset, the proposed method improves the Micro-F1 value by 38.8%, 54.5%, 15.3%, 33.3%, 44.7%, and 9.7% over LDA, Word2Vec, Doc2Vec, WT-LDA, HDP-SOM, and GWSC, respectively.
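
For readers unfamiliar with the Micro-F1 metric reported in the abstract, the following minimal sketch shows how a micro-averaged F1 score is typically computed. The function name `micro_f1` and the toy labels are illustrative assumptions, not taken from the paper, and the sketch assumes each predicted cluster has already been mapped to a category label.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1 once."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp += 1          # correct assignment counts as a true positive
        else:
            fp += 1          # false positive for the predicted class
            fn += 1          # false negative for the true class
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical category labels for five services (assumed data).
y_true = ["mapping", "mapping", "payment", "payment", "social"]
y_pred = ["mapping", "payment", "payment", "payment", "mapping"]
score = micro_f1(y_true, y_pred)
```

With exactly one label per service, every error is counted once as a false positive and once as a false negative, so micro-averaged precision, recall, and F1 all coincide with plain accuracy.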

Key words: Web service representation, Gaussian mixture model, Word2Vec, Web service clustering, Web service classification
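
The abstract does not spell out how the Gaussian mixture model and Word2Vec are combined, so the sketch below is only one plausible reading, not the authors' algorithm. It assumes word vectors are already trained, assumes the mixture parameters are already fitted (spherical components, for brevity), and weights each word vector by its posterior probability under each component before concatenating. All names and toy values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a spherical Gaussian with per-dimension variance `var`."""
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    norm = (2 * math.pi * var) ** (len(x) / 2)
    return math.exp(-sq_dist / (2 * var)) / norm


def responsibilities(x, weights, means, variances):
    """Posterior p(component k | x) under the mixture (the E-step of EM)."""
    joint = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


def topic_aware_vector(x, weights, means, variances):
    """Concatenate K copies of x, the k-th scaled by p(k | x)."""
    resp = responsibilities(x, weights, means, variances)
    return [r * xi for r in resp for xi in x]


def service_vector(words, word_vectors, weights, means, variances):
    """Average the topic-aware vectors of the known words in one description."""
    vecs = [topic_aware_vector(word_vectors[w], weights, means, variances)
            for w in words if w in word_vectors]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Assumed toy inputs: 2-dimensional "Word2Vec" vectors and a 2-component mixture.
word_vectors = {"map": [1.0, 0.0], "route": [0.9, 0.1], "payment": [0.0, 1.0]}
weights = [0.5, 0.5]
means = [[1.0, 0.0], [0.0, 1.0]]
variances = [0.1, 0.1]

map_resp = responsibilities(word_vectors["map"], weights, means, variances)
service = service_vector(["map", "route", "unknown"], word_vectors,
                         weights, means, variances)
```

In this reading, the posterior weights carry the topic information while the underlying vectors carry the semantic information. The tag-word, popularity, and co-occurrence signals mentioned in the abstract are not modeled here.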

CLC Number: