Journal of Frontiers of Computer Science and Technology, 2015, Vol. 9, Issue (9): 1056-1065. DOI: 10.3778/j.issn.1673-9418.1411049


Design of Cloud Framework for Symptom Self-Inspection Services Based on Massive Medical Data

ZHOU Zuojian1,2+, LIN Wenmin1,2, WANG Binbin1,2, PAN Jingui1,2   

  1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210046, China
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210046, China
  • Online: 2015-09-01 Published: 2015-12-11




Abstract:  As the number of people in a "sub-health" state grows, symptom self-inspection services are becoming increasingly important. The regional health cloud platforms now being built on EHRs (electronic health records) provide the data foundation for such services; for example, people can use such a platform to find EHRs similar to their own case. However, acquiring, storing, searching and analyzing a large volume of EHRs remains a challenge. To address these problems, this paper proposes a cloud framework for symptom self-inspection services. Firstly, a Hadoop cluster is built to store and index the massive medical data, which improves the response time of EHR searches. Secondly, a distributed cluster of search nodes based on the Lucene project is designed to retrieve, analyze and privacy-filter the massive EHRs in real time. Moreover, the implementation of the symptom self-inspection services is discussed, including the selection of search nodes, the indexing of EHRs, and the methods for calculating and ranking EHR similarity. Finally, an experiment confirms the scalability and effectiveness of the proposed cloud framework.

Key words: cloud framework, symptom self-inspection services, massive medical data, Hadoop, Lucene
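
The abstract names similarity calculation and ranking of EHRs as part of the service but does not state which measure is used. As a rough illustration only, the sketch below ranks records against a symptom query using TF-IDF weights and cosine similarity, a common choice in Lucene-style retrieval. The function names (`tfidf_vectors`, `cosine`, `rank_similar`), the record format (a list of symptom terms) and the weighting formula are all assumptions made for this example, not the paper's method, and plain Python stands in for the Hadoop and Lucene components.

```python
import math
from collections import Counter


def tfidf_vectors(records):
    """Map each record (a list of symptom terms) to a dict of TF-IDF weights.

    Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1, chosen for illustration.
    """
    n = len(records)
    # Document frequency: in how many records each term appears.
    df = Counter(term for rec in records for term in set(rec))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for rec in records:
        tf = Counter(rec)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_similar(query_terms, records, top_k=3):
    """Return up to top_k (score, record_index) pairs, most similar first."""
    vectors, idf = tfidf_vectors(records)
    tf = Counter(query_terms)
    # Query terms unseen in any record carry no weight.
    query = {t: tf[t] * idf[t] for t in tf if t in idf}
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors)]
    # Sort by descending score, breaking ties by record index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored[:top_k]
```

In a deployment like the one the abstract describes, the index would be built ahead of time and spread across search nodes; this sketch recomputes the weights on every query only to stay short.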
