Survey of Camouflaged Object Detection Based on Deep Learning

doi:10.3778/j.issn.1673-9418.2206078

Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (12): 2734-2751.DOI: 10.3778/j.issn.1673-9418.2206078

• Surveys and Frontiers • Previous Articles Next Articles

Survey of Camouflaged Object Detection Based on Deep Learning

SHI Caijuan¹^,²^,⁺(), REN Bijuan¹^,², WANG Ziwen¹^,², YAN Jinwei¹^,², SHI Ze¹^,²

1. College of Artificial Intelligence, North China University of Science and Technology, Tangshan, Hebei 063210, China
2. Hebei Key Laboratory of Industrial Intelligent Perception, Tangshan, Hebei 063210, China

Received:2022-06-20 Revised:2022-08-29 Online:2022-12-01 Published:2022-12-16
About author:SHI Caijuan, born in 1977, Ph.D., professor, M.S. supervisor, member of CCF. Her research interests include image processing, computer vision, machine learning, etc.
REN Bijuan, born in 1999, M.S. candidate. Her research interests include image processing, computer vision and machine learning.
WANG Ziwen, born in 1993, M.S. candidate. Her research interests include image processing, computer vision and machine learning.
YAN Jinwei, born in 1996, M.S. candidate. His research interests include image processing, computer vision and machine learning.
SHI Ze, born in 1994, M.S. candidate. His research interests include image processing, computer vision and machine learning.
Supported by:
Outstanding Youth Fund of North China University of Science and Technology(JQ201715);Talent Foundation of Tangshan City(A202110011)

基于深度学习的伪装目标检测综述

史彩娟¹^,²^,⁺(), 任弼娟¹^,², 王子雯¹^,², 闫巾玮¹^,², 石泽¹^,²

1.华北理工大学人工智能学院，河北唐山 063210
2.河北省工业智能感知重点实验室，河北唐山 063210

通讯作者: +E-mail: scj-blue@163.com
作者简介:史彩娟（1977—），女，河北唐山人，博士，教授，硕士生导师，CCF会员，主要研究方向为图像处理、计算机视觉、机器学习等。
任弼娟（1999—），女，甘肃定西人，硕士研究生，主要研究方向为图像处理、计算机视觉、机器学习。
王子雯（1993—），女，山东临沂人，硕士研究生，主要研究方向为图像处理、计算机视觉、机器学习。
闫巾玮（1996—），男，河北唐山人，硕士研究生，主要研究方向为图像处理、计算机视觉、机器学习。
石泽（1994—），男，河北唐山人，硕士研究生，主要研究方向为图像处理、计算机视觉、机器学习。
基金资助:
华北理工大学杰出青年基金(JQ201715);唐山市人才项目(A202110011)

Abstract

Abstract:

Camouflaged object detection (COD) based on deep learning is an emerging visual detection task, which aims to detect the camouflaged objects “perfectly” embedded in the surrounding environment. However, most exiting work primarily focuses on building different COD models with little summary work for the existing methods. Therefore, this paper summarizes the existing COD methods based on deep learning and discusses the future development of COD. Firstly, 23 existing COD models based on deep learning are introduced and analyzed according to five detection mechanisms: coarse-to-fine strategy, multi-task learning strategy, confidence-aware learning strategy, multi-source information fusion strategy and transformer-based strategy. The advantages and disadvantages of each strategy are analyzed in depth. And then, 4 widely used datasets and 4 evaluation metrics for COD are introduced. In addition, the performance of the existing COD models based on deep learning is compared on four datasets, including quantitative comparison, visual comparison, efficiency analysis, and the detection effects on camouflaged objects of different types. Furthermore, the practical applications of COD in medicine, industry, agriculture, military, art, etc. are mentioned. Finally, the deficiencies and challenges of existing methods in complex scenes, multi-scale objects, real-time performance, practical application requirements, and COD in other multimodalities are pointed out, and the potential directions of COD are discussed.

Key words: camouflaged object detection (COD), deep learning, feature enhancement

摘要：

基于深度学习的伪装目标检测（COD）是一项新兴的视觉检测任务，其目的是精确且高效地检测出“完美”嵌入周围环境中的伪装目标。目前大多数工作旨在构建不同的伪装目标检测模型，对现有模型的归纳总结及深入分析的综述性工作还很少。因此，对基于深度学习的伪装目标检测模型进行了全面分析和总结，并探讨了伪装目标检测未来的研究方向。首先对基于深度学习的23个伪装目标检测模型分别从由粗到细策略、多任务学习策略、置信感知学习策略、多源信息融合策略以及基于Transformer共5个角度进行了分类介绍，并对每种策略的优劣进行了深入分析；其次介绍了伪装目标检测广泛使用的4个数据集以及4种评估准则；然后对现有基于深度学习的伪装目标检测模型在4个数据集上进行了性能比较，包括定量比较、视觉比较和效率分析，并分析了这些模型对不同类型目标的检测效果；接着简单介绍了伪装目标检测在医学、工业、农业、军事、艺术等领域的应用；最后指出了现有方法在复杂场景、多尺度目标、实时性、实际应用需求、多模态等方面存在的不足和挑战，并探讨了伪装目标检测未来的研究方向。

关键词: 伪装目标检测（COD）, 深度学习, 特征增强

CLC Number:

TP391.4

SHI Caijuan, REN Bijuan, WANG Ziwen, YAN Jinwei, SHI Ze. Survey of Camouflaged Object Detection Based on Deep Learning[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(12): 2734-2751.

史彩娟, 任弼娟, 王子雯, 闫巾玮, 石泽. 基于深度学习的伪装目标检测综述[J]. 计算机科学与探索, 2022, 16(12): 2734-2751.

Figures/Tables 15

Fig.1 Various types of camouflaged objects selected from 4 COD datasets

Fig.2 Map of camouflaged object detection methods based on deep learning

Fig.3 Different feature fusion strategies

Fig.4 Camouflaged objects with edge fuzziness and texture deception

Fig.5 Four types of supervision labels widely used in DGNet

Table 1 Analysis and comparison of different types of camouflaged object detection methods

策略	方法	描述	优点	缺点
由粗到细策略	特征融合（SINet^[18]、D²CNet^[20]、CubeNet^[22]、C²FNet^[23]）	构建多尺度特征连接器如PDC^[21]、NCD、X形连接以及使用注意力引导的融合方式进行融合细化	能在不增加额外线索的前提下充分利用多层特征中有用的多尺度信息	容易造成计算冗余，且目标与背景相似的区域并未被关注
	分心挖掘（PFNet^[24]、 SINetV2^[25]）	特征相减以反向注意的形式促使模型关注伪装目标中不易被检测的部分	以简单的减法运算达到最直接的细化目的，使得模型参数量少，运行速度快	可利用的补充信息较少
	边缘线索（BASNet^[26]、 ERRNet^[30]、SegMaR^[31]）	以显式提取边缘信息或隐式增加边缘感知损失的方式关注边界信息，进行目标边界细化	能获取精细的边缘细节	使用边缘监督学习到的特征响应通常会引入噪声，尤其是复杂场景下
多任务学习策略	定位/排序+分割（LSR^[32]）	分割之前找到伪装目标所在位置；对分割任务的结果按区分难易度进行排名	首次对伪装程度进行探索，能为人类观察伪装目标的难易程度提供一定指引	基于候选区域先检测再分割，容易产生冗余的分割图数量，使得分割任务学习较慢
	分类+分割（ANet^[33]）	在分割之前对图像中的像素进行分类以检测是否有伪装目标的存在	让模型聚焦于包含伪装目标的区域，能一定程度上排除非伪装和背景的干扰	图像中非伪装的显著目标极具迷惑性，因此分类流可能会产生错误判断导致分割失败
	仿生攻击+分割（MirrorNet^[34]）	检测翻转图像中的伪装特征作为仿生攻击流辅助原始图像的分割	促使模型关注纹理和形状特征，能一定程度上增强模型的抗干扰能力	图像翻转并不会改变低辨识度的纹理或者边缘，因此仅使用翻转图像作为补充得到的性能增益十分有限
	纹理检测+分割（TANet^[35]、 TINet^[36]、DGNet^[37]）	设计独特的纹理标签，并使用矩阵方法、卷积方法等来提取纹理特征	基于纹理特征的旋转不变性等特点，使得模型具有较强的抗噪能力	由于伪装目标存在纹理欺骗性，仅关注纹理，会导致检测性能有限
	边缘检测+分割（MGL^[38]）	将边缘检测作为与分割并行的一大任务，来明确建模边缘并推理两个任务的互补信息	使用类型化函数明确推理两个任务间的互补信息	提取的信息种类繁多，很难保证每次提取的信息都是有效的，而且基于图推理的模型使得模型运行速度较慢
置信感知学习策略	对抗训练策略（JCSOD^[42]）	引入对抗训练来显式建模由完全标注伪装目标的困难带来的不确定性	显式生成网络预测的置信度，为模型预测提供可解释性	包含对两种任务的学习，使得模型参数量巨大，推理速度慢
	动态监督策略（CANet^[43]）	利用预测图和真实标签的L1距离作为动态监督关注由硬像素引起的不确定性	能够引导网络重点学习预测不确定的区域上，模型的鲁棒性较高	得到的特征通常响应于伪装目标的稀疏边缘，因此容易在特征学习过程中引入噪声
	正则化约束策略（Zoom-Net^[44]）	在目标检测损失中增加了模糊预测的正则化惩罚项，来迫使模型关注不确定像素	以简单的计算降低模糊背景带来的干扰	多尺度图像的输入，会增加内存占用量和模型运算量
多源信息融合策略	RGB-D （DCNet^[45]）	利用现有的深度估计方法生成伪装深度图，并与RGB数据融合以实现RGB-D的COD	深度图中包含丰富的空间信息，可以作为有效的补充信息提升伪装目标检测性能	提取的深度信息不够准确；没有充分考虑到两种模态信息之间的互补性和差异性
多源信息融合策略	RGB+频域（FDNet^[46]）	使用离线离散余弦变换引入频域信息，同时用特征对齐融合RGB域和频域信息	频域信息的引入以及对所有频带系数的增强，使得模型可以提取判别性信息	使用离散余弦提取频域信息可能会产生方块效应，使得模型计算量增加
Transformer 策略	以Transformer为主干（T²Net^[56]）	使用密集Transformer主干进行伪装目标检测，并使用深监督等策略补充空间信息	长距离特性使模型捕获全局信息的能力更强，对大目标有着较好的检测效果	忽略了局部信息的获取，并且模型参数量巨大，训练耗时
Transformer 策略	Transformer+CNN （UGTR^[57]）	利用CNN生成的概率表示模型学习Transformer框架下伪装目标的不确定性	不确定性引导的上下文推理使得模型更多关注不确定区域	仅对不确定性建模，使得模型不确定响应区域总分布在弱边界和不可区分的纹理区域，学习过程中容易引入噪声

Table 1 Analysis and comparison of different types of camouflaged object detection methods

策略	方法	描述	优点	缺点
由粗到细策略	特征融合（SINet^[18]、D²CNet^[20]、CubeNet^[22]、C²FNet^[23]）	构建多尺度特征连接器如PDC^[21]、NCD、X形连接以及使用注意力引导的融合方式进行融合细化	能在不增加额外线索的前提下充分利用多层特征中有用的多尺度信息	容易造成计算冗余，且目标与背景相似的区域并未被关注
	分心挖掘（PFNet^[24]、 SINetV2^[25]）	特征相减以反向注意的形式促使模型关注伪装目标中不易被检测的部分	以简单的减法运算达到最直接的细化目的，使得模型参数量少，运行速度快	可利用的补充信息较少
	边缘线索（BASNet^[26]、 ERRNet^[30]、SegMaR^[31]）	以显式提取边缘信息或隐式增加边缘感知损失的方式关注边界信息，进行目标边界细化	能获取精细的边缘细节	使用边缘监督学习到的特征响应通常会引入噪声，尤其是复杂场景下
多任务学习策略	定位/排序+分割（LSR^[32]）	分割之前找到伪装目标所在位置；对分割任务的结果按区分难易度进行排名	首次对伪装程度进行探索，能为人类观察伪装目标的难易程度提供一定指引	基于候选区域先检测再分割，容易产生冗余的分割图数量，使得分割任务学习较慢
	分类+分割（ANet^[33]）	在分割之前对图像中的像素进行分类以检测是否有伪装目标的存在	让模型聚焦于包含伪装目标的区域，能一定程度上排除非伪装和背景的干扰	图像中非伪装的显著目标极具迷惑性，因此分类流可能会产生错误判断导致分割失败
	仿生攻击+分割（MirrorNet^[34]）	检测翻转图像中的伪装特征作为仿生攻击流辅助原始图像的分割	促使模型关注纹理和形状特征，能一定程度上增强模型的抗干扰能力	图像翻转并不会改变低辨识度的纹理或者边缘，因此仅使用翻转图像作为补充得到的性能增益十分有限
	纹理检测+分割（TANet^[35]、 TINet^[36]、DGNet^[37]）	设计独特的纹理标签，并使用矩阵方法、卷积方法等来提取纹理特征	基于纹理特征的旋转不变性等特点，使得模型具有较强的抗噪能力	由于伪装目标存在纹理欺骗性，仅关注纹理，会导致检测性能有限
	边缘检测+分割（MGL^[38]）	将边缘检测作为与分割并行的一大任务，来明确建模边缘并推理两个任务的互补信息	使用类型化函数明确推理两个任务间的互补信息	提取的信息种类繁多，很难保证每次提取的信息都是有效的，而且基于图推理的模型使得模型运行速度较慢
置信感知学习策略	对抗训练策略（JCSOD^[42]）	引入对抗训练来显式建模由完全标注伪装目标的困难带来的不确定性	显式生成网络预测的置信度，为模型预测提供可解释性	包含对两种任务的学习，使得模型参数量巨大，推理速度慢
	动态监督策略（CANet^[43]）	利用预测图和真实标签的L1距离作为动态监督关注由硬像素引起的不确定性	能够引导网络重点学习预测不确定的区域上，模型的鲁棒性较高	得到的特征通常响应于伪装目标的稀疏边缘，因此容易在特征学习过程中引入噪声
	正则化约束策略（Zoom-Net^[44]）	在目标检测损失中增加了模糊预测的正则化惩罚项，来迫使模型关注不确定像素	以简单的计算降低模糊背景带来的干扰	多尺度图像的输入，会增加内存占用量和模型运算量
多源信息融合策略	RGB-D （DCNet^[45]）	利用现有的深度估计方法生成伪装深度图，并与RGB数据融合以实现RGB-D的COD	深度图中包含丰富的空间信息，可以作为有效的补充信息提升伪装目标检测性能	提取的深度信息不够准确；没有充分考虑到两种模态信息之间的互补性和差异性
多源信息融合策略	RGB+频域（FDNet^[46]）	使用离线离散余弦变换引入频域信息，同时用特征对齐融合RGB域和频域信息	频域信息的引入以及对所有频带系数的增强，使得模型可以提取判别性信息	使用离散余弦提取频域信息可能会产生方块效应，使得模型计算量增加
Transformer 策略	以Transformer为主干（T²Net^[56]）	使用密集Transformer主干进行伪装目标检测，并使用深监督等策略补充空间信息	长距离特性使模型捕获全局信息的能力更强，对大目标有着较好的检测效果	忽略了局部信息的获取，并且模型参数量巨大，训练耗时
Transformer 策略	Transformer+CNN （UGTR^[57]）	利用CNN生成的概率表示模型学习Transformer框架下伪装目标的不确定性	不确定性引导的上下文推理使得模型更多关注不确定区域	仅对不确定性建模，使得模型不确定响应区域总分布在弱边界和不可区分的纹理区域，学习过程中容易引入噪声

Table 2 Main information of four camouflaged object detection datasets

数据集	年份	发表于	规模		类
数据集	年份	发表于	训练集	测试集	类
CHAMELEON	2018	Unpublished		76
CAMO	2019	CVIU	2 000	500	8
COD10K	2021	TPAMI	6 000	4 000	78
NC4K	2021	CVPR		4 121

Table 3 Quantitative comparison of camouflaged target detection methods based on deep learning

策略	模型	CHAMELEON				CAMO-Test				COD10K-Test				NC4K
策略	模型	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓
1	SINet	0.869	0.891	0.740	0.044	0.751	0.771	0.606	0.100	0.771	0.806	0.551	0.051	0.808	0.871	0.769	0.058
	PFNet	0.882	0.942	0.810	0.033	0.782	0.852	0.695	0.085	0.800	0.868	0.660	0.040	0.829	0.888	0.784	0.052
	D²CNet	0.889	0.939	0.848	0.030	0.774	0.818	0.735	0.087	0.807	0.876	0.720	0.037	—	—	—	—
	C²F-Net	0.888	0.935	0.828	0.032	0.796	0.854	0.719	0.080	0.813	0.890	0.686	0.036	0.838	0.898	0.762	0.049
	SINetV2	0.888	0.942	0.816	0.030	0.820	0.882	0.743	0.070	0.815	0.887	0.680	0.037	0.847	0.903	0.805	0.048
	BASNet	0.914	0.954	0.866	0.022	0.749	0.796	0.646	0.096	0.802	0.855	0.677	0.038	0.817	0.859	0.732	0.058
	ERRNet	0.877	0.927	0.805	0.036	0.761	0.817	0.660	0.088	0.780	0.867	0.629	0.044	0.787	0.848	0.638	0.070
	CubeNet	0.873	0.928	0.787	0.037	0.788	0.838	0.682	0.085	0.795	0.864	0.644	0.041	—	—	—	—
	SegMaR	0.906	0.954	0.860	0.025	0.815	0.872	0.742	0.071	0.833	0.895	0.724	0.033	—	—	—	—
2	ANet	—	—	—	—	0.682	0.685	0.484	0.126	—	—	—	—	—	—	—	—
	MirrorNet	—	—	—	—	0.785	0.849	0.719	0.077	—	—	—	—	—	—	—	—
	LSR	0.893	0.938	0.839	0.033	0.793	0.826	0.725	0.085	0.793	0.868	0.685	0.041	0.839	0.883	0.779	0.053
	TANet	0.888	0.911	0.786	0.036	0.793	0.834	0.690	0.083	0.803	0.848	0.629	0.041	—	—	—	—
	TINet	0.874	0.916	0.783	0.038	0.781	0.847	0.678	0.087	0.793	0.848	0.645	0.043	0.829	0.879	0.734	0.055
	DGNet	—	—	—	—	0.839	0.901	0.769	0.057	0.822	0.896	0.693	0.033	0.857	0.911	0.784	0.042
	MGL	0.893	0.923	0.813	0.030	0.775	0.847	0.673	0.088	0.814	0.865	0.666	0.035	0.833	0.867	0.782	0.052
3	JCSOD	0.894	0.943	0.848	0.030	0.803	0.853	0.759	0.076	0.817	0.892	0.726	0.035	0.842	0.898	0.771	0.047
	CANet	0.885	0.940	0.831	0.029	0.799	0.865	0.770	0.075	0.809	0.885	0.703	0.035	0.842	0.904	0.803	0.047
	Zoom-Net	0.902	0.958	0.845	0.023	0.820	0.892	0.794	0.066	0.838	0.911	0.729	0.029	0.853	0.912	0.784	0.043
4	DCNet	0.895	0.951	0.856	0.027	0.819	0.881	0.798	0.069	0.829	0.903	0.751	0.032	0.855	0.910	0.825	0.042
4	FDNet	0.898	0.949	0.837	0.027	0.844	0.898	0.778	0.062	0.837	0.918	0.731	0.030	—	—	—	—
5	UGTR	0.888	0.918	0.796	0.031	0.785	0.859	0.686	0.086	0.818	0.850	0.667	0.035	0.839	0.874	0.747	0.052
5	T²Net	0.899	0.958	0.858	0.023	0.860	0.920	0.832	0.050	0.843	0.917	0.765	0.029	0.872	0.927	0.837	0.037

Table 3 Quantitative comparison of camouflaged target detection methods based on deep learning

策略	模型	CHAMELEON				CAMO-Test				COD10K-Test				NC4K
策略	模型	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓	$S_{α}$ ↑	$E_{ϕ}$ ↑	$F_{β}^{w}$ ↑	$M$ ↓
1	SINet	0.869	0.891	0.740	0.044	0.751	0.771	0.606	0.100	0.771	0.806	0.551	0.051	0.808	0.871	0.769	0.058
	PFNet	0.882	0.942	0.810	0.033	0.782	0.852	0.695	0.085	0.800	0.868	0.660	0.040	0.829	0.888	0.784	0.052
	D²CNet	0.889	0.939	0.848	0.030	0.774	0.818	0.735	0.087	0.807	0.876	0.720	0.037	—	—	—	—
	C²F-Net	0.888	0.935	0.828	0.032	0.796	0.854	0.719	0.080	0.813	0.890	0.686	0.036	0.838	0.898	0.762	0.049
	SINetV2	0.888	0.942	0.816	0.030	0.820	0.882	0.743	0.070	0.815	0.887	0.680	0.037	0.847	0.903	0.805	0.048
	BASNet	0.914	0.954	0.866	0.022	0.749	0.796	0.646	0.096	0.802	0.855	0.677	0.038	0.817	0.859	0.732	0.058
	ERRNet	0.877	0.927	0.805	0.036	0.761	0.817	0.660	0.088	0.780	0.867	0.629	0.044	0.787	0.848	0.638	0.070
	CubeNet	0.873	0.928	0.787	0.037	0.788	0.838	0.682	0.085	0.795	0.864	0.644	0.041	—	—	—	—
	SegMaR	0.906	0.954	0.860	0.025	0.815	0.872	0.742	0.071	0.833	0.895	0.724	0.033	—	—	—	—
2	ANet	—	—	—	—	0.682	0.685	0.484	0.126	—	—	—	—	—	—	—	—
	MirrorNet	—	—	—	—	0.785	0.849	0.719	0.077	—	—	—	—	—	—	—	—
	LSR	0.893	0.938	0.839	0.033	0.793	0.826	0.725	0.085	0.793	0.868	0.685	0.041	0.839	0.883	0.779	0.053
	TANet	0.888	0.911	0.786	0.036	0.793	0.834	0.690	0.083	0.803	0.848	0.629	0.041	—	—	—	—
	TINet	0.874	0.916	0.783	0.038	0.781	0.847	0.678	0.087	0.793	0.848	0.645	0.043	0.829	0.879	0.734	0.055
	DGNet	—	—	—	—	0.839	0.901	0.769	0.057	0.822	0.896	0.693	0.033	0.857	0.911	0.784	0.042
	MGL	0.893	0.923	0.813	0.030	0.775	0.847	0.673	0.088	0.814	0.865	0.666	0.035	0.833	0.867	0.782	0.052
3	JCSOD	0.894	0.943	0.848	0.030	0.803	0.853	0.759	0.076	0.817	0.892	0.726	0.035	0.842	0.898	0.771	0.047
	CANet	0.885	0.940	0.831	0.029	0.799	0.865	0.770	0.075	0.809	0.885	0.703	0.035	0.842	0.904	0.803	0.047
	Zoom-Net	0.902	0.958	0.845	0.023	0.820	0.892	0.794	0.066	0.838	0.911	0.729	0.029	0.853	0.912	0.784	0.043
4	DCNet	0.895	0.951	0.856	0.027	0.819	0.881	0.798	0.069	0.829	0.903	0.751	0.032	0.855	0.910	0.825	0.042
4	FDNet	0.898	0.949	0.837	0.027	0.844	0.898	0.778	0.062	0.837	0.918	0.731	0.030	—	—	—	—
5	UGTR	0.888	0.918	0.796	0.031	0.785	0.859	0.686	0.086	0.818	0.850	0.667	0.035	0.839	0.874	0.747	0.052
5	T²Net	0.899	0.958	0.858	0.023	0.860	0.920	0.832	0.050	0.843	0.917	0.765	0.029	0.872	0.927	0.837	0.037

Fig.6 Visual comparison of deep learning-based camouflaged object detection methods

Fig.7 Efficiency analysis of existing COD methods

Fig.8 Applications of COD in medicine

Fig.9 Applications of COD in industry

Fig.10 Applications of COD in agriculture

Fig.11 Camouflage soldier detection

Fig.12 Some animals embedded into landscape images

References 68

[1]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]// Proceedings of the Annual Conference on Neural Information Processing Systems 2015, Montreal, Dec 7-12, 2015. Red Hook: Curran Associates, 2015: 91-99.
[2]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 2999-3007.
[3]	HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 2980-2988.
[4]	WU Y H, LIU Y, ZHANG L, et al. EDN: salient object detection via extremely-downsampled network[J]. IEEE Transactions on Image Processing, 2022, 31: 3125-3136. DOI URL
[5]	PANG Y, ZHAO X, ZHANG L, et al. Multi-scale interactive network for salient object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 9410-9419.
[6]	FENG M, LU H, DING E. Attentive feedback network for boundary-aware salient object detection[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 1623-1632.
[7]	TANKUS A, YESHURUN Y. Detection of regions of interest and camouflage breaking by direct convexity estimation[C]// Proceedings of the 1998 IEEE Workshop on Visual Surveillance, Bombay, Dec 5- 8, 1998. Washington: IEEE Computer Society, 1998: 42-48.
[8]	ZHANG X, ZHU C, WANG S, et al. A Bayesian approach to camouflaged moving object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 27(9): 2001-2013. DOI URL
[9]	BEIDERMAN Y, TEICHER M, GARCIA J, et al. Optical technique for classification, recognition and identification of obscured objects[J]. Optics Communications, 2010, 283(21): 4274-4282. DOI URL
[10]	GALUN M, SHARON E, BASRI R, et al. Texture segmen-tation by multiscale aggregation of filter responses and shape elements[C]// Proceedings of the 9th IEEE International Conference on Computer Vision, Nice, Oct 14-17, 2003. Washington: IEEE Computer Society, 2003: 716-723.
[11]	GUO H, DOU Y, TIAN T, et al. A robust foreground segmentation method by temporal averaging multiple video frames[C]// Proceedings of the 2008 International Conference on Audio, Language and Image Processing, Shanghai, Jul 7-9, 2008. Piscataway: IEEE, 2008: 878-882.
[12]	KAVITHA C, RAO B P, GOVARDHAN A. An efficient content based image retrieval using color and texture of image sub-blocks[J]. International Journal of Engineering Science and Technology, 2011, 3(2): 1060-1068.
[13]	HALL J R, CUTHILL I C, BADDELEY R, et al. Camouflage, detection and identification of moving targets[J]. Proceedings of the Royal Society B: Biological Sciences, 2013, 280(1758): 20130064. DOI URL
[14]	BI H, ZHANG C, WANG K, et al. Rethinking camouflaged object detection: models and datasets[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(9): 5708-5724. DOI URL
[15]	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]// Proceedings of the 2015 International Conference on Learning Represen-tations, San Diego, May 7-9, 2015. Piscataway: IEEE, 2015: 1-14.
[16]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Con-ference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 770-778.
[17]	GAO S H, CHENG M M, ZHAO K, et al. Res2net: a new multi-scale backbone architecture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 43(2): 652-662. DOI URL
[18]	FAN D P, JI G P, SUN G, et al. Camouflaged object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 2774-2784.
[19]	WU Z, SU L, HUANG Q. Cascaded partial decoder for fast and accurate salient object detection[C]// Proceedings of the 2019 IEEE/CVF conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway:IEEE, 2019: 3907-3916.
[20]	WANG K, BI H, ZHANG Y, et al. D2C-Net: a dual-branch, dual-guidance and cross-refine network for camouflaged object detection[J]. IEEE Transactions on Industrial Electr-onics, 2022, 69(5): 5364-5374.
[21]	RONNEBERGER O, FISCHER P, BROX T. U-net: con-volutional networks for biomedical image segmentation[C]// LNCS 9351: Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Oct 5-9, 2015. Cham:Springer, 2015: 234-241.
[22]	ZHUGE M, LU X, GUO Y, et al. CubeNet: X-shape connection for camouflaged object detection[J]. Pattern Recognition, 2022, 127: 108644. DOI URL
[23]	SUN Y, CHEN G, ZHOU T, et al. Context-aware cross-level fusion network for camouflaged object detection[J]. arXiv:2105.12555, 2021.
[24]	MEI H, JI G P, WEI Z, et al. Camouflaged object segmentation with distraction mining[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8772-8781.
[25]	FAN D P, JI G P, CHENG M M, et al. Concealed object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(10): 6024-6042.
[26]	QIN X, FAN D P, HUANG C, et al. Boundary-aware segmentation network for mobile and web applications[J]. arXiv:2101.04704, 2021.
[27]	DE BOER P T, KROESE D P, MANNOR S, et al. A tutorial on the cross-entropy method[J]. Annals of Operations Research, 2005, 134(1): 19-67. DOI URL
[28]	WANG Z, SIMONCELLI E P, BOVIK A C. Multiscale structural similarity for image quality assessment[C]// Proceedings of the 37th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, Nov 9-12, 2003. Piscataway:IEEE, 2003: 1398-1402.
[29]	MÁTTYUS G, LUO W, URTASUN R. DeepRoadMapper: extracting road topology from aerial images[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 3438-3446.
[30]	JI G P, ZHU L, ZHUGE M, et al. Fast camouflaged object detection via edge-based reversible re-calibration network[J]. Pattern Recognition, 2022, 123: 108414. DOI URL
[31]	JIA Q, YAO S, LIU Y, et al. Segment, magnify and reiterate: detecting camouflaged objects the hard way[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, Jun 19-20, 2022. Piscataway: IEEE, 2022: 4713-4722.
[32]	LV Y, ZHANG J, DAI Y, et al. Simultaneously localize, segment and rank the camouflaged objects[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 11591-11601.
[33]	LE T N, NGUYEN T V, NIE Z, et al. Anabranch network for camouflaged object segmentation[J]. Computer Vision and Image Understanding, 2019, 184: 45-56. DOI URL
[34]	YAN J, LE T N, NGUYEN K D, et al. MirrorNet: bio-inspired camouflaged object segmentation[J]. IEEE Access, 2021, 9: 43290-43300. DOI URL
[35]	REN J, HU X, ZHU L, et al. Deep texture-aware features for camouflaged object detection[J]. arXiv:2102.02996, 2021.
[36]	ZHU J, ZHANG X, ZHANG S, et al. Inferring camouflaged objects by texture-aware interactive guidance network[C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence, the 33rd Conference on Innovative Applications of Artificial Intelligence, the 11th Symposium on Educational Advances in Artificial Intelligence. Menlo Park: AAAI, 2021: 3599-3607.
[37]	JI G P, FAN D P, CHOU Y C, et al. Deep gradient learning for efficient camouflaged object detection[J]. arXiv:2205.12853, 2022.
[38]	ZHAI Q, LI X, YANG F, et al. Mutual graph learning for camouflaged object detection[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 12997-13007.
[39]	KENDALL A, GAL Y. What uncertainties do we need in Bayesian deep learning for computer vision?[C]// Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, Dec 4-9, 2017. Red Hook: Curran Associates, 2017: 5574-5584.
[40]	SHEN Y, ZHANG Z, SABUNCU M R, et al. Real-time uncertainty estimation in computer vision via uncertainty-aware distribution distillation[C]// Proceedings of the 2021 IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, Jan 3-8, 2021. Piscataway: IEEE, 2021: 707-716.
[41]	LAKSHMINARAYANAN B, PRITZEL A, BLUNDELL C. Simple and scalable predictive uncertainty estimation using deep ensembles[C]// Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, Dec 4-9, 2017. Red Hook: Curran Associates, 2017: 6402-6413.
[42]	LI A, ZHANG J, LV Y, et al. Uncertainty-aware joint salient object and camouflaged object detection[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10071-10081.
[43]	LIU J, ZHANG J, BARNES N. Confidence-aware learning for camouflaged object detection[J]. arXiv:2106.11641, 2021.
[44]	PANG Y W, ZHAO X Q, XIANG T Z, et al. Zoom in and out: a mixed-scale triplet network for camouflaged object detection[J]. arXiv:2203.02688, 2022.
[45]	ZHANG J, LV Y, XIANG M, et al. Depth confidence-aware camouflaged object detection[J]. arXiv:2106.13217, 2021.
[46]	ZHONG Y, LI B, TANG L, et al. Detecting camouflaged object in frequency domain[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, Jun 19-20, 2022. Piscataway: IEEE, 2022: 4504-4513.
[47]	LUO W, LI Y, URTASUN R, et al. Understanding the effective receptive field in deep convolutional neural networks[C]// Proceedings of the Annual Conference on Neural Information Processing Systems 2016, Barcelona, Dec 5-10, 2016. Red Hook: Curran Associates, 2016: 4898-4906.
[48]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, Dec 4-9, 2017. Red Hook: Curran Associates, 2017: 5998-6008.
[49]	周雪, 柏正尧, 陆倩杰, 等. 融合金字塔Transformer和轴向注意的结直肠息肉分割[J/OL]. 计算机工程与应用(2022-03-28)[2022-04-16]. https://kns.cnki.net/kcms/detail/11.2127.TP.20220325.1526.016.html.
	ZHOU X, BO Z R, LU Q J, et al. Colorectal polyp segmentation combining pyramid Transformer and axial attention[J/OL]. Computer Engineering and Applications(2022-03-28)[2022-04-16]. https://kns.cnki.net/kcms/detail/11.2127.TP.20220325.1526.016.html.
[50]	赵琛琦, 王华虎, 赵涓涓, 等. 视觉Transformer与多特征融合的脑卒中检测算法[J]. 中国图象图形学报, 2022, 27(3):923-934.
	ZHAO C Q, WANG H H, ZHAO J J, et al. Cerebral stroke detection algorithm for visual Transformer and multi-feature fusion[J]. Journal of Image and Graphics, 2022, 27(3): 923-934.
[51]	DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16 × 16 words: Transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[52]	WANG W, XIE E, LI X, et al. Pyramid vision Transformer: a versatile backbone for dense prediction without convolutions[C]// Proceedings of the 2021 IEEE/CVF International Con-ference on Computer Vision, Montreal, Oct 10-17, 2021. Piscataway: IEEE, 2021: 548-558.
[53]	ZHENG S, LU J, ZHAO H, et al. Rethinking semantic segmentation from a sequence-to-sequence perspective with Transformers[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 6881-6890.
[54]	CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with Transformers[C]// LNCS 12346:Proc-eedings of the 16th European Conference on Computer Vision, Glasgow, Aug 23-28, 2020. Cham: Springer, 2020: 213-229.
[55]	LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted widows[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Oct 10-17, 2021. Piscataway: IEEE, 2021: 10012-10022.
[56]	MAO Y, ZHANG J, WAN Z, et al. Transformer transforms salient object detection and camouflaged object detection[J]. arXiv:2104.10127, 2021.
[57]	YANG F, ZHAI Q, LI X, et al. Uncertainty-guided Trans-former reasoning for camouflaged object detection[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Oct 10-17, 2021. Piscataway: IEEE, 2021: 4146-4155.
[58]	FAN D P, CHENG M M, LIU Y, et al. Structure-measure: a new way to evaluate foreground maps[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 4548-4557.
[59]	FAN D P, GONG C, CAO Y, et al. Enhanced-alignment measure for binary foreground map evaluation[C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2018: 698-704.
[60]	MARGOLIN R, ZELNIK-MANOR L, TAL A. How to evaluate foreground maps?[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, Jun 23-28, 2014. Washington: IEEE Computer Society, 2014: 248-255.
[61]	PERAZZI F, KRÄHENBÜHL P, PRITCH Y, et al. Saliency filters: contrast based filtering for salient region detection[C]// Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, Jun 16-21, 2012. Washington: IEEE Computer Society, 2012: 733-740.
[62]	JI G P, XIAO G, CHOU Y C, et al. Video polyp segmentation: a deep learning perspective[J]. arXiv:2203.14291, 2022.
[63]	FAN D P, ZHOU T, JI G P, et al. Inf-net: automatic covid-19 lung infection segmentation from CT images[J]. IEEE Transactions on Medical Imaging, 2020, 39(8): 2626-2637. DOI URL
[64]	HE T, LIU Y, XU C, et al. A fully convolutional neural network for wood defect location and identification[J]. IEEE Access, 2019, 7: 123453-123462. DOI URL
[65]	张文霞. 基于图像处理技术的蝗虫识别算法研究[D]. 呼和浩特: 内蒙古大学, 2020.
	ZHANG W X. Research on locust recognition algorithm based on image processing technology[D]. Hohhot: Inner Mongolia University, 2020.
[66]	刘芳, 刘玉坤, 林森. 基于改进型YOLO的复杂环境下番茄果实快速识别方法[J]. 农业机械学报, 2020, 51(6):229-237.
	LIU F, LIU Y K, LIN S. Fast recognition method for tomatoes under complex environments based on improved YOLO[J]. Transactions of the Chinese Society of Agri-cultural Machinery, 2020, 51(6):229-237.
[67]	CHU H K, HSU W H, MITRA N J, et al. Camouflage images[J]. ACM Transactions on Graphics, 2010, 29(4): 51.
[68]	CHENG X, XIONG H, FAN D P, et al. Implicit motion handling for video camouflaged object detection[C]// Pro-ceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, Jun 19-20, 2022. Piscataway: IEEE, 2022: 13864-13873.

Survey of Camouflaged Object Detection Based on Deep Learning

基于深度学习的伪装目标检测综述

RichHTML

PDF

Knowledge

Abstract

Cite this article

share this article

Figures/Tables 15

References 68

Related Articles 15

Recommended Articles 0

Metrics

[1]	ZHANG Lu, LU Tianliang, DU Yanhui. Overview of Facial Deepfake Video Detection Methods [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(1): 1-26.
[2]	WANG Shichen, HUANG Kai, CHEN Zhigang, ZHANG Wendong. Survey on 3D Human Pose Estimation of Deep Learning [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(1): 74-87.
[3]	LIANG Jiali, HUA Baojian, LYU Yashuai, SU Zhenyu. Loop Invariant Code Motion Algorithm for Deep Learning Operators [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(1): 127-139.
[4]	WANG Jianzhe, WU Qin. Salient Object Detection Based on Coordinate Attention Feature Pyramid [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(1): 154-165.
[5]	YANG Caidong, LI Chengyang, LI Zhongbo, XIE Yongqiang, SUN Fangwei, QI Jin. Review of Image Super-resolution Reconstruction Algorithms Based on Deep Learning [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(9): 1990-2010.
[6]	ZHANG Xiangping, LIU Jianxun. Overview of Deep Learning-Based Code Representation and Its Applications [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(9): 2011-2029.
[7]	LI Dongmei, LUO Sisi, ZHANG Xiaoping, XU Fu. Review on Named Entity Recognition [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(9): 1954-1968.
[8]	REN Ning, FU Yan, WU Yanxia, LIANG Pengju, HAN Xi. Review of Research on Imbalance Problem in Deep Learning Applied to Object Detection [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(9): 1933-1953.
[9]	LYU Xiaoqi, JI Ke, CHEN Zhenxiang, SUN Runyuan, MA Kun, WU Jun, LI Yidong. Expert Recommendation Algorithm Combining Attention and Recurrent Neural Network [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(9): 2068-2077.
[10]	AN Fengping, LI Xiaowei, CAO Xiang. Medical Image Classification Algorithm Based on Weight Initialization-Sliding Window CNN [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(8): 1885-1897.
[11]	ZENG Fanzhi, XU Luqian, ZHOU Yan, ZHOU Yuexia, LIAO Junwei. Review of Knowledge Tracing Model for Intelligent Education [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(8): 1742-1763.
[12]	LIU Yi, LI Mengmeng, ZHENG Qibin, QIN Wei, REN Xiaoguang. Survey on Video Object Tracking Algorithms [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(7): 1504-1515.
[13]	ZHAO Xiaoming, YANG Yijiao, ZHANG Shiqing. Survey of Deep Learning Based Multimodal Emotion Recognition [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(7): 1479-1503.
[14]	XIA Hongbin, XIAO Yifei, LIU Yuan. Long Text Generation Adversarial Network Model with Self-Attention Mechanism [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(7): 1603-1610.
[15]	SUN Fangwei, LI Chengyang, XIE Yongqiang, LI Zhongbo, YANG Caidong, QI Jin. Review of Deep Learning Applied to Occluded Object Detection [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(6): 1243-1259.