Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (1): 127-137. DOI: 10.3778/j.issn.1673-9418.2209065

• Graphics and Image •

Research on Dense Counting Combining Density Map Regression and Detection

高洁,赵心馨,于健,徐天一,潘丽,杨珺,喻梅,李雪威   

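  1. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
    2. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin 300350, China
    3. Tianjin Key Laboratory of Advanced Network Technology and Application, Tianjin University, Tianjin 300350, China
    4. School of Future Technology, Tianjin University, Tianjin 300350, China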
  • Publication date: 2024-01-01 Release date: 2024-01-01

Counting Method Based on Density Graph Regression and Object Detection

GAO Jie, ZHAO Xinxin, YU Jian, XU Tianyi, PAN Li, YANG Jun, YU Mei, LI Xuewei   

  1. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
    2. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin 300350, China
    3. Tianjin Key Laboratory of Advanced Network Technology and Application, Tianjin University, Tianjin 300350, China
    4. School of Future Technology, Tianjin University, Tianjin 300350, China
  • Online: 2024-01-01 Published: 2024-01-01

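Abstract: Of the two mainstream approaches to dense counting, detection-based methods suffer from low recall, and density-map-based methods lose the location information of target objects. To address both problems, this paper combines the detection task with the regression task and proposes a detection-and-counting method based on density map regression, which both counts and localizes target objects in dense scenes. The two approaches complement each other: recall improves, and the positions of all target objects are marked. To extract richer features for complex data scenarios, the network introduces a feature pyramid optimization module that vertically fuses low-level high-resolution features with top-level abstract semantic features and horizontally fuses features of the same size, enriching the semantic representation of target objects. Because target objects occupy a small proportion of pixels in dense counting scenes, an attention mechanism for small targets is also proposed; it builds a mask over the input image to strengthen the network's attention to target objects and thereby raises detection sensitivity. Experimental results show that the proposed method substantially improves recall while keeping accuracy essentially unchanged, and that it marks target positions accurately, providing both the count and the locations for an input image. The method has broad application prospects in fields such as industry and ecology.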

Keywords: dense counting, object detection, deep learning, density map regression, feature pyramid

Abstract: The two mainstream dense-counting approaches each have a weakness: detection-based methods have low recall, and density-map-based methods discard target location information. By combining the detection and regression tasks, a detection and counting method based on density map regression is proposed that both counts and localizes target objects in dense scenes. The two approaches complement each other, so recall improves and every target is localized. To extract richer features for complex data scenarios, a feature pyramid optimization module is proposed, which vertically fuses low-level high-resolution features with top-level abstract semantic features and horizontally fuses same-size features to enrich the semantic representation of target objects. Because target objects occupy a small proportion of pixels in dense counting scenarios, an attention mechanism for small targets is also proposed; it constructs a mask on the input image to strengthen the network’s attention to target objects and thereby improve detection sensitivity. Experimental results demonstrate that the proposed method significantly improves recall while keeping accuracy essentially unchanged, and that it locates targets accurately, providing both counting and positioning information for an input image. The method has a wide range of application prospects in fields such as industry and ecology.
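The abstract does not give the network details, so the following is only a minimal sketch of the general idea behind density-map counting, not the authors' implementation. It assumes the common convention that each annotated object contributes a unit-mass Gaussian, so the sum of the map equals the object count. The function names, the kernel radius, and `sigma` are illustrative assumptions, and the peak-picking step is a simple stand-in for localization, not the detection branch the paper describes.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    size = 2 * radius + 1
    kernel = [
        [math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(points, height, width, radius=3, sigma=1.0):
    """Build a density map with one unit-mass Gaussian per (row, col) point."""
    kernel = gaussian_kernel(radius, sigma)
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        # Keep only kernel cells inside the image, then renormalize so a
        # point near the border still contributes a total mass of 1.
        cells = []
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = r + dy, c + dx
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy + radius][dx + radius]))
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / mass
    return dmap


def count_from_density(dmap):
    """The estimated count is the integral (sum) of the density map."""
    return sum(sum(row) for row in dmap)


def local_peaks(dmap, threshold):
    """Hypothetical localization stand-in: strict 8-neighbor maxima."""
    height, width = len(dmap), len(dmap[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            v = dmap[y][x]
            if v < threshold:
                continue
            neighbors = [
                dmap[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(v > n for n in neighbors):
                peaks.append((y, x))
    return peaks


points = [(0, 0), (5, 5), (12, 20)]
dmap = density_map(points, height=20, width=30)
estimated_count = count_from_density(dmap)
estimated_positions = local_peaks(dmap, threshold=0.1)
```

Summing the map recovers the count even when objects overlap, which is why density regression tends to miss fewer objects than box detection. Peak-picking, by contrast, merges nearby objects, which illustrates why a density map alone gives weak location information.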

Key words: dense counting, object detection, deep learning, density map regression, feature pyramid
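The vertical and horizontal fusion mentioned in the abstract can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not the paper's feature pyramid optimization module: it treats each feature map as a single-channel 2D grid, uses nearest-neighbor upsampling and element-wise addition for the vertical (top-down) pass, and models horizontal fusion as adding back the original map of the same size. The names `upsample2x`, `add_maps`, and `fuse_pyramid` are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_maps(a, b):
    """Element-wise sum of two feature maps of identical size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse_pyramid(levels):
    """Fuse a pyramid whose levels[0] is finest; each next level is half size.

    Assumed scheme: a top-down pass followed by a same-size skip connection.
    """
    # Vertical (top-down) pass: push coarse semantic features into finer maps.
    top_down = [None] * len(levels)
    top_down[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        top_down[i] = add_maps(levels[i], upsample2x(top_down[i + 1]))
    # Horizontal pass: combine each fused map with the original same-size map.
    return [add_maps(td, orig) for td, orig in zip(top_down, levels)]


fine = [[1] * 4 for _ in range(4)]
coarse = [[1, 2], [3, 4]]
fused = fuse_pyramid([fine, coarse])
```

In a real network each addition would typically be followed by learned convolutions and the maps would have many channels; the sketch only shows how information flows between scales.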