Journal of Frontiers of Computer Science and Technology, 2011, Vol. 5, Issue (12): 1094-1104.

• Academic Research •

LC-MS Compounds Identification under MapReduce

LI Jianhui, LIU Yong, WANG Weihua, ZHOU Yuanchun, XUE Xingya

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. Graduate University of Chinese Academy of Sciences, Beijing 100190, China
  3. Biotechnology Department, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, Liaoning 116023, China

  • Online: 2011-12-01 Published: 2011-12-01

Abstract: With the rapid development of mass spectrometry, compound identification by liquid chromatography mass spectrometry (LC-MS) has become an active research topic in recent years. To address the convenience and efficiency of LC-MS compound identification, this paper proposes a preprocessing method for mass spectra that converts two-column spectrum files into three-column files suited to subsequent parallel processing, which simplifies the calculation of compound similarity. Building on MapReduce, it partitions the reference (standard) library into equal parts and proposes an LC-MS compound identification algorithm based on the MapReduce computing model. Experimental results show that this MapReduce-based parallel approach can greatly improve the efficiency of LC-MS compound identification.

Key words: MapReduce, liquid chromatography mass spectrometry (LC-MS), compound identification
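
The pipeline summarized in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the three-column layout is taken to be (compound id, m/z, intensity), m/z values are assumed to be already binned so that they can be matched exactly, cosine similarity stands in for the unspecified similarity measure, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def to_three_columns(compound_id, two_column_rows):
    """Tag each (m/z, intensity) row with its compound id (assumed layout)."""
    return [(compound_id, mz, intensity) for mz, intensity in two_column_rows]


def cosine_similarity(query, reference):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts.

    Assumption: the paper's similarity measure is not stated in the abstract.
    """
    dot = sum(v * reference.get(mz, 0.0) for mz, v in query.items())
    norm_q = math.sqrt(sum(v * v for v in query.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    return dot / (norm_q * norm_r) if norm_q and norm_r else 0.0


def split_library(rows, n_parts):
    """Split the library into n_parts, keeping each compound's rows together.

    Compounds are assigned round-robin, so part sizes differ by at most one
    compound.
    """
    parts = [[] for _ in range(n_parts)]
    ids = sorted({cid for cid, _, _ in rows})
    slot = {cid: i % n_parts for i, cid in enumerate(ids)}
    for row in rows:
        parts[slot[row[0]]].append(row)
    return parts


def map_partition(query, part):
    """Map step: score every compound in one partition against the query."""
    spectra = defaultdict(dict)
    for cid, mz, intensity in part:
        spectra[cid][mz] = intensity
    return [(cid, cosine_similarity(query, spec)) for cid, spec in spectra.items()]


def reduce_best(scored_lists):
    """Reduce step: keep the best-scoring (compound id, score) pair, or None."""
    all_scores = [pair for scores in scored_lists for pair in scores]
    return max(all_scores, key=lambda pair: pair[1], default=None)


def identify(query, library_rows, n_parts=2):
    """Run the map step on each partition, then reduce to the best match."""
    parts = split_library(library_rows, n_parts)
    return reduce_best(map_partition(query, part) for part in parts)
```

In a real MapReduce deployment each partition would be scored by a separate map task and the reduce step would merge their outputs; here the partitions are simply processed in a loop. Because every row carries its compound id, a partition can be scored without access to any other file, which is one plausible reason a three-column layout is convenient for parallel processing.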