Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (10): 2273-2285. DOI: 10.3778/j.issn.1673-9418.2103059

• High Performance Computing •

GPU-Oriented Parallel Algorithm for Histogram Statistical Image Enhancement

XIAO Han1,3, SUN Lupeng1, LI Cailin2,+, ZHOU Qinglei3

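  1. School of Information Science and Technology, Zhengzhou Normal University, Zhengzhou 450044, China
    2. School of Civil and Architectural Engineering, Shandong University of Technology, Zibo, Shandong 255000, China
    3. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China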
  • Received: 2021-03-17 Revised: 2021-05-25 Online: 2022-10-01 Published: 2021-06-23
  • Corresponding author: + E-mail: licailin@sdut.edu.cn
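  • About the authors: XIAO Han, born in 1970, male, from Wuhan, Hubei, Ph.D., professor, senior member of CCF. His research interests include high performance computing, image processing, and large-scale parallel numerical software.
    SUN Lupeng, born in 1970, male, from Gongyi, Henan, M.S., lecturer, member of CCF. His research interests include parallel computing and image processing.
    LI Cailin, born in 1985, male, from Anqing, Anhui, Ph.D., associate professor. His research interests include digital photogrammetry and computer vision, and digital image processing.
    ZHOU Qinglei, born in 1962, male, from Zhengzhou, Henan, Ph.D., professor, Ph.D. supervisor, distinguished member of CCF. His research interests include parallel algorithms, image processing, and parallel computing.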
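  • Funding:
    National Natural Science Foundation of China (41601496); National Natural Science Foundation of China (61572444); Key Scientific Research Projects of Colleges and Universities of Henan Province (22A520049); Open Fund of Key Laboratory of Geospatial Technology for the Middle and Lower Yellow River Regions (Henan University), Ministry of Education (GTYR202004); Open Foundation of the Key Laboratory of Geo-Environmental Monitoring of Great Bay Area (Shenzhen University), Ministry of Natural Resources of China (SZU51029202003); Key Project of Art Science in Shandong Province (ZD202008267); Key Project of Art Science in Shandong Province (201806353)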

GPU-Oriented Parallel Algorithm for Histogram Statistical Image Enhancement

XIAO Han1,3, SUN Lupeng1, LI Cailin2,+, ZHOU Qinglei3

  1. School of Information Science and Technology, Zhengzhou Normal University, Zhengzhou 450044, China
    2. School of Civil and Architectural Engineering, Shandong University of Technology, Zibo, Shandong 255000, China
    3. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
  • Received: 2021-03-17 Revised: 2021-05-25 Online: 2022-10-01 Published: 2021-06-23
  • About the authors: XIAO Han, born in 1970, Ph.D., professor, senior member of CCF. His research interests include high performance computing, image processing, and large-scale parallel numerical software.
    SUN Lupeng, born in 1970, M.S., lecturer, member of CCF. His research interests include parallel computing and image processing.
    LI Cailin, born in 1985, Ph.D., associate professor. His research interests include digital photogrammetry and computer vision, and digital image processing.
    ZHOU Qinglei, born in 1962, Ph.D., professor, Ph.D. supervisor, distinguished member of CCF. His research interests include parallel algorithms, image processing, and parallel computing.
  • Supported by:
    National Natural Science Foundation of China (41601496); National Natural Science Foundation of China (61572444); Key Scientific Research Projects of Colleges and Universities of Henan Province (22A520049); Open Fund of Key Laboratory of Geospatial Technology for the Middle and Lower Yellow River Regions (Henan University), Ministry of Education (GTYR202004); Open Foundation of the Key Laboratory of Geo-Environmental Monitoring of Great Bay Area (Shenzhen University), Ministry of Natural Resources of China (SZU51029202003); Key Project of Art Science in Shandong Province (ZD202008267); Key Project of Art Science in Shandong Province (201806353)

Abstract:

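Histogram statistics plays an important role in fields such as image enhancement and target detection. As images grow larger and real-time requirements tighten, however, the local enhancement algorithm based on histogram statistics becomes too slow to meet expectations. To address this shortcoming, the histogram statistical image enhancement algorithm is parallelized on a graphics processing unit (GPU) platform, which speeds up the processing of large-format digital images. First, active thread blocks and active threads of the compute unified device architecture (CUDA) are used to process different sub-image blocks and pixels in parallel, which improves data access efficiency. Then, kernel configuration parameter optimization and data-parallel computing are applied to parallelize the algorithm on the GPU. Finally, an efficient data transfer mode between the host and the device further shortens the execution time on the heterogeneous computing platform. The study shows that, for images of different sizes, the parallel histogram statistics algorithm is two orders of magnitude faster than the serial CPU algorithm. Processing an image of 3241×3685 pixels takes 787.11 ms, and the processing speed of the parallel algorithm is improved by a factor of 261.35, which lays a good foundation for real-time large-scale image processing.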

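Key words: histogram statistics, local enhancement, local mean, graphics processing unit (GPU), compute unified device architecture (CUDA), parallel algorithm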

Abstract:

Histogram statistics has important applications in image enhancement and target detection. However, as image sizes grow and real-time requirements become stricter, the histogram statistical local enhancement algorithm runs too slowly to reach a satisfactory speed. To address this deficiency, this paper implements the histogram statistical image enhancement algorithm in parallel on a graphics processing unit (GPU) platform, which improves the processing speed for large-format digital images. Firstly, data access efficiency is improved by making full use of compute unified device architecture (CUDA) active thread blocks and active threads to process different sub-image blocks and pixels in parallel. Then, the algorithm is parallelized on the GPU platform by using kernel configuration parameter optimization and data-parallel computing. Finally, an efficient data transfer mode between the host and the device is adopted, which further shortens the execution time of the system on the heterogeneous computing platform. The results show that, for images of different sizes, the parallel image histogram statistics algorithm is two orders of magnitude faster than the CPU serial algorithm. Processing an image of size 3241×3685 takes 787.11 ms, and the processing speed of the parallel algorithm is increased by 261.35 times. This lays a good foundation for real-time large-scale image processing.
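
The abstract does not state the enhancement rule or the kernel code. As a rough illustration only, the sketch below assumes the common textbook form of histogram-statistics local enhancement: a pixel is multiplied by a constant `E` when its neighborhood mean is low relative to the global mean and its neighborhood standard deviation falls within a band relative to the global standard deviation. It is a serial Python reference, not the authors' CUDA implementation. The names `enhance_pixel`, `enhance`, and `grid_dims`, the parameter values `E`, `k0`, `k1`, `k2`, the 3×3 window, and the 16×16 thread block are all assumptions made for this example.

```python
from math import sqrt


def mean_std(values):
    """Return the mean and (population) standard deviation of a list."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    return m, sqrt(var)


def enhance_pixel(img, y, x, g_mean, g_std, radius, E, k0, k1, k2):
    """Work for one output pixel; in a GPU version this would be one thread."""
    h, w = len(img), len(img[0])
    # Gather the neighborhood, clipped at the image border.
    window = [img[j][i]
              for j in range(max(0, y - radius), min(h, y + radius + 1))
              for i in range(max(0, x - radius), min(w, x + radius + 1))]
    l_mean, l_std = mean_std(window)
    dark = l_mean <= k0 * g_mean
    low_contrast = k1 * g_std <= l_std <= k2 * g_std
    if dark and low_contrast:
        return min(255, round(E * img[y][x]))  # amplify, clamp to 8 bits
    return img[y][x]


def enhance(img, radius=1, E=4.0, k0=0.4, k1=0.02, k2=0.4):
    """Serial reference: every pixel is independent, hence data-parallel."""
    g_mean, g_std = mean_std([v for row in img for v in row])
    return [[enhance_pixel(img, y, x, g_mean, g_std, radius, E, k0, k1, k2)
             for x in range(len(img[0]))]
            for y in range(len(img))]


def grid_dims(width, height, block=(16, 16)):
    """Number of thread blocks needed to cover the image (ceiling division)."""
    bx, by = block
    return ((width + bx - 1) // bx, (height + by - 1) // by)
```

Because each call to `enhance_pixel` reads only the input image and two global statistics, the two loops in `enhance` can be replaced by one thread per pixel, with each thread block covering one sub-image tile. Under the assumed 16×16 block, and treating 3241 as the width, `grid_dims(3241, 3685)` gives a grid of 203 by 231 blocks.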

Key words: histogram statistics, local enhancement, local mean, graphics processing unit (GPU), compute unified device architecture (CUDA), parallel algorithm

CLC Number: