Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (8): 1865-1876. DOI: 10.3778/j.issn.1673-9418.2012041
Corresponding author:
E-mail: zhy16622553596@163.com
HE Li, ZHANG Hongyan, FANG Wanlin
Received:
2020-12-11
Revised:
2021-02-04
Online:
2022-08-01
Published:
2021-03-03
About author:
HE Li, born in 1969, Ph.D., professor, M.S. supervisor, professional member of CCF. Her research interests include data mining, machine learning, etc.
Abstract:
Locating objects of interest is a fundamental task in computer vision applications. Salient instance segmentation detects visually salient objects and segments them at the pixel level, yielding the instance classes of interest. The single-stage salient instance segmentation network (S4Net) introduced a new region feature extraction layer, ROIMasking, to exploit the separability of features between a target object and its surrounding background. However, owing to the nature of convolutional neural networks, repeated convolution and upsampling cause the loss of instance boundary information, which leads to coarse boundary segmentation and reduces segmentation accuracy. To address this loss of boundary information in salient instance segmentation, this paper builds on S4Net and draws on object edge detection methods to propose an end-to-end salient instance segmentation method that incorporates boundary features (MBCNet). The method designs a multi-scale fused boundary feature extraction branch, strengthens the extraction of instance boundary information through a boundary refinement module built from hybrid dilated convolution and a residual network structure, and propagates the boundary information through shared network layers. To further improve segmentation accuracy, a new joint boundary-segmentation loss function is proposed, enabling synchronous training of the boundary feature extraction branch and the instance segmentation branch within the same network. Experimental results show that the proposed method achieves mAP0.5 of 88.90% and mAP0.7 of 67.94% on the saliency instance dataset, surpassing the mainstream salient instance segmentation method S4Net by 2.20 and 4.24 percentage points, respectively.
HE Li, ZHANG Hongyan, FANG Wanlin. Salient Instance Segmentation via Multiscale Boundary Characteristic Network[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(8): 1865-1876.
Feature map | Input size | In channels | Kernel | Kernels | Out channels | Output size |
---|---|---|---|---|---|---|
Input image | 320×320 | 3 | — | — | — | — |
F3 | 40×40 | 256 | 3×3 | 256 | 256 | 40×40 |
F4 | 20×20 | 256 | 3×3 | 256 | 256 | 20×20 |
F5 | 10×10 | 256 | 3×3 | 256 | 256 | 10×10 |
I1 | 20×20 | 256 | — | — | 256 | 40×40 |
I2 | 10×10 | 256 | — | — | 256 | 40×40 |
C1 | 40×40 | 256 | 3×3 | 128 | 128 | 40×40 |
BR1-c1 | 40×40 | 128 | 3×3 | 64 | 64 | 40×40 |
BR1-d1 | 40×40 | 64 | 3×3 | 64 | 64 | 40×40 |
BR1-cc1 | — | — | — | — | 256 | 40×40 |
BR1-c2 | 40×40 | 256 | 1×1 | 64 | 64 | 40×40 |
BR1-r1 | — | — | — | — | 128 | 40×40 |
Cc2 | — | — | — | — | 256 | 40×40 |
C2 | 40×40 | 256 | 1×1 | 64 | 64 | 40×40 |
BR4-c1 | 40×40 | 128 | 3×3 | 64 | 64 | 40×40 |
BR4-d1 | 40×40 | 64 | 3×3 | 64 | 64 | 40×40 |
BR4-cc1 | — | — | — | — | 256 | 40×40 |
BR4-c2 | 40×40 | 256 | 1×1 | 64 | 64 | 40×40 |
BR4-r1 | — | — | — | — | 128 | 40×40 |
C3 | 40×40 | 128 | 3×3 | 1 | 1 | 40×40 |
Table 1 Parameters of each layer of boundary feature extraction branch of target instance
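As a sanity check on Table 1: every feature map in the branch stays at 40×40, which for stride-1 convolutions implies "same" padding. A minimal sketch of the output-size arithmetic (the dilation rate of 2 for the dilated BR layer is an assumption; the table lists only kernel sizes):

```python
def conv2d_out(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a 2-D convolution (square input and kernel)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# For stride 1, "same" padding is p = dilation * (kernel - 1) // 2.
assert conv2d_out(40, 3, padding=1) == 40               # plain 3×3 conv (e.g. BR1-c1)
assert conv2d_out(40, 3, padding=2, dilation=2) == 40   # dilated 3×3, rate 2 (assumed rate)
assert conv2d_out(40, 1) == 40                          # 1×1 conv (e.g. BR1-c2)
```

With these padding choices every row of Table 1 that lists a kernel reproduces the stated 40×40 output.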
Model | mAP0.5 | mAP0.7 |
---|---|---|
MSRNet | 65.30 | 52.30 |
S4Net | 86.70 | 63.70 |
MDNN | 84.90 | 67.87 |
MBCNet | 88.90 | 67.94 |
Table 2 Comparison of experimental results of different models (%)
Model | Metric | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
S4Net | mAP0.5 | 88.45 | 87.86 | 85.47 | 78.31 |
MBCNet | mAP0.5 | 93.00 | 90.97 | 85.02 | 78.20 |
S4Net | mAP0.7 | 71.28 | 68.24 | 57.15 | 37.58 |
MBCNet | mAP0.7 | 75.87 | 72.27 | 63.32 | 39.31 |
Table 3 Comparison of segmentation results with different numbers of instances (%)
Input layers | mAP0.5/% | mAP0.7/% | Time/(s/iter) |
---|---|---|---|
F2,F3,F4,F5,F6 | 88.3 | 67.8 | 0.435 |
F2,F3,F4,F5 | 88.2 | 67.8 | 0.418 |
F3,F4,F5,F6 | 87.8 | 66.3 | 0.423 |
F2,F3,F4 | 88.5 | 67.3 | 0.410 |
F3,F4,F5 | 88.9 | 67.9 | 0.408 |
F4,F5,F6 | 87.9 | 66.6 | 0.413 |
F2,F3 | 87.7 | 66.2 | 0.390 |
F3,F4 | 87.4 | 66.0 | 0.388 |
F4,F5 | 88.2 | 66.3 | 0.395 |
Table 4 Comparison of effects of multi-scale fusion strategies on segmentation results
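Table 4's best result fuses F3, F4 and F5, which requires bringing the 20×20 and 10×10 maps up to F3's 40×40 grid (rows I1 and I2 in Table 1). A minimal sketch of that shape bookkeeping, assuming nearest-neighbour upsampling and element-wise summation (the interpolation method and fusion operator are assumptions; this excerpt does not specify them):

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

# Feature maps at the resolutions listed in Tables 1 and 4 (random stand-ins).
f3 = np.random.rand(256, 40, 40)
f4 = np.random.rand(256, 20, 20)
f5 = np.random.rand(256, 10, 10)

# Bring F4 and F5 up to F3's 40×40 grid (I1 and I2 in Table 1), then fuse.
i1 = upsample_nearest(f4, 2)
i2 = upsample_nearest(f5, 4)
fused = f3 + i1 + i2  # element-wise sum; the actual fusion op is assumed

assert fused.shape == (256, 40, 40)
```

The shapes match the 40×40, 256-channel maps that feed layer C1 in Table 1.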
Combination | mAP0.5 | mAP0.7 |
---|---|---|
Base | 87.7 | 65.1 |
Base+BR1+BR2+BR3 | 87.8 | 66.2 |
Base+BR4 | 88.2 | 65.3 |
Base+BR1+BR2+BR3+BR4 | 88.9 | 67.9 |
Table 5 Comparison of experimental results of various combinations of BR blocks (%)
Model | mAP0.5/% | mAP0.7/% | Time/(s/iter) |
---|---|---|---|
MBCNet_noResidual | 86.8 | 64.7 | 0.400 |
MBCNet_noHDC | 86.9 | 64.1 | 0.408 |
MBCNet | 88.9 | 67.9 | 0.408 |
Table 6 Comparison of influence of HDC and residual structure on segmentation results
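Table 6 shows that removing hybrid dilated convolution (HDC) costs 2.0 and 3.8 percentage points on mAP0.5 and mAP0.7. The motivation for HDC (Wang et al., ref [26]) can be illustrated with a 1-D coverage check: a hybrid schedule of dilation rates reaches every offset in the receptive field, while a constant rate leaves periodic holes, the "gridding" artifact. The rates below are illustrative, not necessarily the ones used in the paper:

```python
def covered_offsets(rates, taps=(-1, 0, 1)):
    """1-D offsets reachable by stacking dilated 3-tap convolutions
    with the given dilation rates (tap positions {-d, 0, +d} per layer)."""
    offsets = {0}
    for d in rates:
        offsets = {o + t * d for o in offsets for t in taps}
    return offsets

# A hybrid schedule such as [1, 2, 5] covers its receptive field densely;
# a constant rate like [2, 2, 2] reaches only even offsets, leaving holes.
hybrid = covered_offsets([1, 2, 5])
constant = covered_offsets([2, 2, 2])

assert all(o in hybrid for o in range(-8, 9))  # dense coverage, no gridding
assert all(o % 2 == 0 for o in constant)       # every other pixel is skipped
```

The same argument extends to 2-D: mixing dilation rates lets the stacked boundary refinement layers see boundary pixels at every position rather than on a sparse grid.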
Loss combination | mAP0.5 | mAP0.7 |
---|---|---|
Lseg | 86.7 | 63.6 |
Lseg + Ledge | 87.8 | 65.3 |
Ledge-seg | 88.9 | 67.9 |
Table 7 Comparison of different loss function combinations (%)
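Table 7 indicates that the joint loss Ledge-seg outperforms training with Lseg alone or with separate Lseg and Ledge terms. A minimal sketch of one plausible form of such a joint objective, assuming both branches are supervised with pixel-wise binary cross-entropy and combined additively (the additive form and the weight `lam` are assumptions; the exact formulation is not given in this excerpt):

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy, averaged over the map."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def joint_loss(seg_pred, seg_gt, edge_pred, edge_gt, lam=1.0):
    """Joint boundary-segmentation loss, sketched as
    L_edge-seg = L_seg + lam * L_edge (additive form is an assumption)."""
    return bce(seg_pred, seg_gt) + lam * bce(edge_pred, edge_gt)

# Toy 4×4 masks: a near-perfect prediction drives the joint loss toward zero.
gt = np.zeros((4, 4))
gt[1:3, 1:3] = 1.0
near_perfect = np.where(gt > 0, 0.999, 0.001)
assert joint_loss(near_perfect, gt, near_perfect, gt) < 0.01
```

Because both terms share one backward pass, the boundary branch and the segmentation branch are updated synchronously, which is the property the abstract credits for the accuracy gain.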
[1] LI G B, XIE Y, LIN L, et al. Instance-level salient object segmentation[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 247-256.
[2] FAN R C, CHENG M M, HOU Q B, et al. S4Net: single stage salient-instance segmentation[C]//Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Washington: IEEE Computer Society, 2019: 6096-6105.
[3] SHELHAMER E, LONG J, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, Jun 7-12, 2015. Washington: IEEE Computer Society, 2015: 3431-3440.
[4] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[5] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 2980-2988.
[6] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 779-788.
[7] TIAN Z, SHEN C, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 9627-9636.
[8] BOLYA D, ZHOU C, XIAO F, et al. YOLACT: real-time instance segmentation[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 9156-9165.
[9] XIE E, SUN P, SONG X, et al. PolarMask: single shot instance segmentation with polar representation[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Washington: IEEE Computer Society, 2020: 12190-12199.
[10] POSNER M I. Neural mechanisms of selective visual attention[J]. Annual Review of Neuroscience, 1995, 18(1): 193-222.
[11] ITTI L, KOCH C, NIEBUR E. A model of saliency-based visual attention for rapid scene analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11): 1254-1259.
[12] LIU T, SUN J, ZHENG N N, et al. Learning to detect a salient object[C]//Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, Jun 17-22, 2007. Washington: IEEE Computer Society, 2007: 353-367.
[13] LI G, YU Y. Visual saliency based on multiscale deep features[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, Jun 7-12, 2015. Washington: IEEE Computer Society, 2015: 5455-5463.
[14] ZHAO R, OUYANG W, LI H, et al. Saliency detection by multi-context deep learning[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, Jun 7-12, 2015. Washington: IEEE Computer Society, 2015: 1265-1274.
[15] YU M, LI B Z, YU Y, et al. Image saliency detection with multi-graph model and manifold ranking[J]. Acta Automatica Sinica, 2019, 45(3): 577-592.
[16] CHANG Z, DUAN X H, LU W C, et al. Multi-scale saliency detection based on Bayesian framework[J]. Computer Engineering and Applications, 2020, 56(11): 207-213.
[17] LU S M, GUO Q, WANG R, et al. Salient object detection using multi-scale features with attention recurrent mechanism[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(12): 1926-1937.
[18] PEI J L, TANG H, LIU C, et al. Salient instance segmentation via subitizing and clustering[J]. Neurocomputing, 2020, 402: 423-436.
[19] KITTLER J. On the accuracy of the Sobel edge detector[J]. Image & Vision Computing, 1983, 1(1): 37-42.
[20] CANNY J. A computational approach to edge detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, 8(6): 679-698.
[21] ARBELÁEZ P, MAIRE M, FOWLKES C, et al. Contour detection and hierarchical image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(5): 898-916.
[22] XIE S, TU Z. Holistically-nested edge detection[J]. International Journal of Computer Vision, 2017, 125(5): 3-18.
[23] YANG J, PRICE B, COHEN S, et al. Object contour detection with a fully convolutional encoder-decoder network[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 193-202.
[24] DONG Z H, SHAO X L. A multi-category edge perception method for semantic segmentation[J]. Journal of Computer-Aided Design & Computer Graphics, 2019, 31(7): 1075-1085.
[25] QIAN B X, XIAO Z Y, SONG W. Application of improved convolutional neural network in lung image segmentation[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(8): 1358-1367.
[26] WANG P Q, CHEN P F, YUAN Y, et al. Understanding convolution for semantic segmentation[C]//Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, Lake Tahoe, Mar 12-15, 2018. Washington: IEEE Computer Society, 2018: 1451-1460.
[27] GIRSHICK R. Fast R-CNN[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Dec 13-16, 2015. Washington: IEEE Computer Society, 2015: 1440-1448.