Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (12): 2954-2966. DOI: 10.3778/j.issn.1673-9418.2211068

• Graphics·Image •

Open World Object Detection Combining Graph-FPN and Robust Optimization

XIE Binhong (谢斌红), ZHANG Pengju (张鹏举), ZHANG Rui (张睿)

  1. College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China
  • Online: 2023-12-01 Published: 2023-12-01

Abstract: Open world object detection (OWOD) requires a model to detect every object in an image, whether its category is known or unknown, and to learn new categories incrementally so that its knowledge stays up to date. To address the low recall of unknown objects and the catastrophic forgetting during incremental learning in the ORE (Open World Object Detector) method, this paper proposes GARO-ORE, an adjustable robust optimization of ORE based on a graph feature pyramid. First, the superpixel image structure of Graph-FPN, together with its hierarchical design of a context layer and a hierarchical layer, is used to capture rich semantic information and help the model localize unknown objects accurately. Second, a robust optimization method is used to account for uncertainty, yielding a base-class learning strategy based on flat minima that, as far as possible, keeps the model from forgetting previously learned categories while it learns new ones. Finally, a knowledge-transfer-based method for initializing the classification weights of new classes improves the model's adaptability to them. Experimental results on the OWOD dataset show that GARO-ORE achieves better recall on unknown categories. In the 10+10, 15+5, and 19+1 incremental object detection (iOD) tasks, mAP improves by 1.38, 1.42, and 1.44 percentage points, respectively. These results indicate that GARO-ORE improves the recall of unknown objects and promotes the learning of subsequent tasks while effectively alleviating catastrophic forgetting of old tasks.

Key words: open world object detection (OWOD), graph feature pyramid network, flat minimum, knowledge transfer
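
The intuition behind the flat-minimum idea named in the abstract can be illustrated with a small sketch. It is not the training procedure of GARO-ORE, whose details are not given on this page. It assumes a common way of probing flatness: average the loss over random parameter perturbations inside a small box around a minimum. The function names (`noise_averaged_loss`, `sharp_loss`, `flat_loss`), the quadratic toy losses, and the radius are all hypothetical choices made for illustration.

```python
import numpy as np


def noise_averaged_loss(loss_fn, theta, radius, n_samples=64, seed=0):
    """Average loss_fn over uniform perturbations of theta within +/- radius.

    A low value means the loss stays small in a neighbourhood of theta,
    which is the informal meaning of a "flat" minimum.
    (Illustrative helper, not taken from the paper.)
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    noise = rng.uniform(-radius, radius, size=(n_samples,) + theta.shape)
    return float(np.mean([loss_fn(theta + eps) for eps in noise]))


def sharp_loss(theta):
    # Toy quadratic with high curvature: a sharp minimum at the origin.
    return float(50.0 * np.sum(theta ** 2))


def flat_loss(theta):
    # Toy quadratic with low curvature: a flat minimum at the origin.
    return float(0.5 * np.sum(theta ** 2))


theta_star = np.zeros(2)  # both toy losses are minimized here, with loss 0
sharp_score = noise_averaged_loss(sharp_loss, theta_star, radius=0.1)
flat_score = noise_averaged_loss(flat_loss, theta_star, radius=0.1)

print(f"sharp minimum, noise-averaged loss: {sharp_score:.5f}")
print(f"flat minimum,  noise-averaged loss: {flat_score:.5f}")
```

Both toy minima have zero loss exactly at `theta_star`, but the flat one tolerates parameter drift far better. That is the property an incremental learner would want when later updates for new classes move the weights slightly away from the base-class solution.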