Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (10): 2273-2285. DOI: 10.3778/j.issn.1673-9418.2103059
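Received: 2021-03-17
Revised: 2021-05-25
Publication date: 2022-10-01
Release date: 2021-06-23
Corresponding author: + E-mail: licailin@sdut.edu.cn
About the author: XIAO Han (born 1970), male, from Wuhan, Hubei, Ph.D., professor, senior member of CCF. His research interests include high performance computing, image processing and large-scale parallel numerical software.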
XIAO Han¹,³, SUN Lupeng¹, LI Cailin²,⁺, ZHOU Qinglei³
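Abstract:
Histogram statistics has important applications in fields such as image enhancement and object detection. However, as image sizes keep growing and real-time requirements become stricter, the local enhancement algorithm based on histogram statistics runs too slowly to meet the expected speed. To address this shortcoming, this paper implements the histogram statistics image enhancement algorithm in parallel on the graphics processing unit (GPU) platform, improving the processing speed for large-format digital images. First, the active thread blocks and active threads of the compute unified device architecture (CUDA) are fully used to process different sub-image blocks and pixels in parallel, which improves data access efficiency. Then, kernel configuration parameter optimization and data-parallel computing techniques are adopted to parallelize the histogram statistics image enhancement algorithm on the GPU platform. Finally, an efficient data transfer mode between the host and the device further shortens the execution time of the system on the heterogeneous computing platform. The results show that, for images of different sizes, the parallel histogram statistics algorithm is two orders of magnitude faster than the CPU serial algorithm. Processing an image of 3 241×3 685 pixels takes 787.11 ms, a speedup of 261.35 times, which lays a good foundation for real-time large-scale image processing.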
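A serial reference version of the local enhancement described in the abstract can be sketched as follows. The abstract does not state the decision rule or its parameters, so this sketch assumes the widely used textbook formulation: a pixel is amplified by a gain when its neighborhood is darker than `k0` times the global mean and its local standard deviation lies between `k1` and `k2` times the global one. The function names, the replicate padding, and the default values of `radius`, `gain`, `k0`, `k1` and `k2` are illustrative assumptions, not the authors' implementation.

```python
def gray_probabilities(image, levels=256):
    """Estimate p(r) for every gray level r from the normalized histogram."""
    counts = [0] * levels
    total = 0
    for row in image:
        for value in row:
            counts[value] += 1
            total += 1
    return [c / total for c in counts]


def mean_and_variance(probabilities):
    """Mean and variance of the gray levels under the distribution p(r)."""
    mean = sum(r * p for r, p in enumerate(probabilities))
    variance = sum((r - mean) ** 2 * p for r, p in enumerate(probabilities))
    return mean, variance


def pad_replicate(image, radius):
    """Extend the image by `radius` pixels per side by repeating edge pixels."""
    rows = [[row[0]] * radius + list(row) + [row[-1]] * radius for row in image]
    top = [rows[0][:] for _ in range(radius)]
    bottom = [rows[-1][:] for _ in range(radius)]
    return top + rows + bottom


def enhance(image, radius=1, gain=4.0, k0=0.4, k1=0.02, k2=0.4, levels=256):
    """Amplify dark, low-contrast pixels; leave every other pixel unchanged.

    All parameter defaults are assumed values chosen for illustration.
    """
    global_mean, global_var = mean_and_variance(gray_probabilities(image, levels))
    global_std = global_var ** 0.5
    padded = pad_replicate(image, radius)
    size = 2 * radius + 1
    output = []
    for y in range(len(image)):
        out_row = []
        for x in range(len(image[0])):
            # Local statistics come from the histogram of the window around (x, y).
            window = [row[x:x + size] for row in padded[y:y + size]]
            local_mean, local_var = mean_and_variance(
                gray_probabilities(window, levels))
            local_std = local_var ** 0.5
            is_dark = local_mean <= k0 * global_mean
            is_low_contrast = k1 * global_std <= local_std <= k2 * global_std
            value = image[y][x]
            if is_dark and is_low_contrast:
                value = min(levels - 1, int(round(gain * value)))
            out_row.append(value)
        output.append(out_row)
    return output


# Demo: a dark checkerboard (10/20) on the left, a flat bright area (200) on the right.
demo = [[(10 if (x + y) % 2 == 0 else 20) if x < 3 else 200 for x in range(6)]
        for y in range(6)]
enhanced = enhance(demo)
```

Each output pixel depends only on its own window and on the global statistics, which is the property that allows a GPU version to assign pixels to independent threads.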
XIAO Han, SUN Lupeng, LI Cailin, ZHOU Qinglei. GPU-Oriented Parallel Algorithm for Histogram Statistical Image Enhancement[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(10): 2273-2285.
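Main function | Execution time/ms | Share of execution time/% |
---|---|---|
Initialize histogram statistics image | 36.00 | 0.032 |
Gray-level probability estimation | 1 212.00 | 1.080 |
Image mean calculation | 10.00 | 0.009 |
Image variance calculation | 14.00 | 0.012 |
Image padding | 53.00 | 0.047 |
Local histogram statistics calculation | 110 193.00 | 97.960 |
Other | 969.00 | 0.860 |
Total | 112 487.00 | 100.000 |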
Table 1 Proportion of running time of main functions of histogram statistical algorithm
Threads per block | Computation time/ms |
---|---|
8×8 | 1 572.53 |
16×16 | 1 554.41 |
24×24 | 1 593.84 |
32×32 | 1 882.17 |
Table 2 Influence of thread block dimension on computing speed
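The block sizes compared in Table 2 translate into launch configurations by ceiling division. The sketch below assumes one thread per pixel and a 4 357×4 872 image (an inference from the 1 554.41 ms figure, which matches the CUDA time reported for that size in Table 4). It uses only the per-block and per-SM thread limits listed in Table 3. Real occupancy also depends on register and shared memory use, which the source does not report, so `resident_blocks_per_sm` is only an upper bound from the thread limit. The function name and dictionary keys are hypothetical.

```python
MAX_THREADS_PER_SM = 2048      # "maximum active threads per SM" in Table 3
MAX_THREADS_PER_BLOCK = 1024   # "threads per block" in Table 3


def launch_config(width, height, block_side):
    """Grid size and thread-limited residency for a square thread block."""
    threads = block_side * block_side
    if threads > MAX_THREADS_PER_BLOCK:
        raise ValueError("block exceeds the per-block thread limit")
    # Ceiling division so that partial tiles at the image edge are still covered.
    grid_x = (width + block_side - 1) // block_side
    grid_y = (height + block_side - 1) // block_side
    resident_blocks = MAX_THREADS_PER_SM // threads
    return {
        "grid": (grid_x, grid_y),
        "threads_per_block": threads,
        "resident_blocks_per_sm": resident_blocks,
        "resident_threads_per_sm": resident_blocks * threads,
    }


for side in (8, 16, 24, 32):
    print(side, launch_config(4357, 4872, side))
```

Under these assumptions a 24×24 block fits three times per SM and leaves 320 of the 2 048 thread slots unused, while 8×8, 16×16 and 32×32 blocks divide the limit evenly. The thread limit alone therefore does not explain the timings in Table 2.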
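Parameter | Setting |
---|---|
CPU | Intel Core i5-7400 (quad-core), 3.2 GHz clock frequency |
Memory | 8 GB |
Graphics card | NVIDIA GTX 1070 |
Video memory | 8 GB GDDR5 |
Number of CUDA cores | 1 920 |
Maximum active threads per SM | 2 048 |
Threads per block | 1 024 |
Thread block dimensions | 1 024×1 024×64 |
Grid dimensions | (2^31-1)×65 535×65 535 |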
Table 3 Experimental environment configuration
Image size | Serial processing time/ms | OpenMP parallel processing time/ms | CUDA parallel processing time/ms |
---|---|---|---|
547×754 | 7 494.00 | 5 677.27 | 40.23 |
1 246×1 652 | 38 338.00 | 21 538.20 | 156.49 |
2 265×2 768 | 112 487.00 | 52 564.02 | 432.14 |
3 241×3 685 | 205 714.00 | 84 655.96 | 787.11 |
4 357×4 872 | 370 229.00 | 120 204.22 | 1 554.41 |
5 361×5 763 | 592 134.00 | 170 643.80 | 2 556.55 |
Table 4 Comparison of running time of histogram statistical algorithm
Image size | OpenMP speedup | CUDA speedup |
---|---|---|
547×754 | 1.32 | 186.28 |
1 246×1 652 | 1.78 | 244.98 |
2 265×2 768 | 2.14 | 260.30 |
3 241×3 685 | 2.43 | 261.35 |
4 357×4 872 | 3.08 | 238.18 |
5 361×5 763 | 3.47 | 231.61 |
Table 5 Performance comparison of histogram statistical algorithm
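The speedups in Table 5 follow from the timings in Table 4 as serial time divided by parallel time. The short check below recomputes them. The timing values are copied from Table 4, while the variable and function names are illustrative. The recomputed values agree with the published ones to within 0.01.

```python
# (image size, serial ms, OpenMP ms, CUDA ms), copied from Table 4
TIMINGS_MS = [
    ("547x754", 7494.00, 5677.27, 40.23),
    ("1246x1652", 38338.00, 21538.20, 156.49),
    ("2265x2768", 112487.00, 52564.02, 432.14),
    ("3241x3685", 205714.00, 84655.96, 787.11),
    ("4357x4872", 370229.00, 120204.22, 1554.41),
    ("5361x5763", 592134.00, 170643.80, 2556.55),
]


def speedup(serial_ms, parallel_ms):
    """Speedup = serial execution time / parallel execution time."""
    return serial_ms / parallel_ms


SPEEDUPS = {
    size: (speedup(serial, openmp), speedup(serial, cuda))
    for size, serial, openmp, cuda in TIMINGS_MS
}

for size, (openmp_gain, cuda_gain) in SPEEDUPS.items():
    print(f"{size}: OpenMP {openmp_gain:.2f}x, CUDA {cuda_gain:.2f}x")
```

The CUDA speedup peaks at the 3 241×3 685 image and declines for the two largest sizes, while the OpenMP speedup keeps rising with image size.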