Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (4): 938-949.DOI: 10.3778/j.issn.1673-9418.2010031
• Graphics and Image •
Received: 2020-10-12
Revised: 2021-01-07
Online: 2022-04-01
Published: 2021-02-04
About author: LI Kuankuan, born in 1995 in Shijiazhuang, Hebei, M.S. candidate. His research interests include image processing and computer vision.
Corresponding author: E-mail: liulib@163.com
CLC Number:
LI Kuankuan, LIU Libo. Fine-Grained Image Classification Model Based on Bilinear Aggregate Residual Attention[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(4): 938-949.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2010031
Table 1 Dataset statistics of the training and test sets

| Dataset | Categories | Training images | Test images |
|---|---|---|---|
| CUB-200-2011 | 200 | 5 994 | 5 794 |
| FGVC-Aircraft | 100 | 6 667 | 3 333 |
| Stanford Cars | 196 | 8 144 | 8 041 |
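For reference, the three benchmarks in Table 1 are commonly consumed as per-class image folders split into train/test sets. Below is a minimal PyTorch loading sketch; the directory names (e.g., `data/cub200/train`) are placeholders for illustration, not the authors' released pipeline.

```python
# Minimal data-loading sketch (assumed folder layout, not the authors' pipeline).
import torch
from torchvision import datasets, transforms

train_tf = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Each dataset is assumed to be arranged as one sub-folder per class.
train_set = datasets.ImageFolder("data/cub200/train", transform=train_tf)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=32, shuffle=True, num_workers=4)

print(len(train_set.classes))  # 200 categories for CUB-200-2011 (Table 1)
```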
Table 2 ξ value assignment using BARAN with 512 feature channels

| Dataset | cnums/cgroups |
|---|---|
| CUB-200-2011 | 2/88, 3/112 |
| FGVC-Aircraft | 5/100, 6/2 |
| Stanford Cars | 2/76, 3/120 |
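The cnums/cgroups pairs appear to partition the 512 feature channels into per-class groups in the style of the mutual-channel loss [14]: each pair contributes cnums × cgroups channels, and for every dataset in Table 2 the pairs cover all 512 channels. A small sanity-check sketch (variable names are ours, not the authors'):

```python
# Sanity check: each dataset's cnums/cgroups pairs cover all 512 feature channels.
assignments = {
    "CUB-200-2011":  [(2, 88), (3, 112)],
    "FGVC-Aircraft": [(5, 100), (6, 2)],
    "Stanford Cars": [(2, 76), (3, 120)],
}

for name, pairs in assignments.items():
    total = sum(cnum * cgroup for cnum, cgroup in pairs)
    print(f"{name}: {total} channels")  # prints 512 for every dataset (Table 2)
```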
Table 3 Experimental comparison of the SA module with ResNeXt under different cardinalities

| Method | Base model | Params/10^6 | Accuracy/% |
|---|---|---|---|
| B-CNN[M,D] | VGG16-M + VGG16-D | 13.8 | 84.1 |
| BARN[2×64d] | ResNeXt29×2 + SA | 34.8 | 84.8 |
| BARN[4×64d] | ResNeXt29×2 + SA | 34.6 | 85.2 |
| BARN[8×64d] | ResNeXt29×2 + SA | 34.4 | 85.5 |
| BARN[32×4d] | ResNeXt29×2 + SA | 18.2 | 85.9 |
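The C×d notation (e.g., 32×4d) follows ResNeXt [16]: the bottleneck's 3×3 convolution is split into C groups of width d, so a higher cardinality with a smaller group width needs fewer parameters, consistent with BARN[32×4d] being both the lightest and the most accurate variant in Table 3. A minimal grouped-bottleneck sketch in PyTorch (our own simplification, not the paper's exact block):

```python
# ResNeXt-style grouped bottleneck (simplified sketch, not the paper's exact block).
import torch
import torch.nn as nn

class GroupedBottleneck(nn.Module):
    def __init__(self, in_ch, out_ch, cardinality=32, group_width=4):
        super().__init__()
        width = cardinality * group_width          # e.g. 32 * 4 = 128
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, width, kernel_size=1, bias=False),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1,
                      groups=cardinality, bias=False),   # grouped 3x3 convolution
            nn.BatchNorm2d(width), nn.ReLU(inplace=True),
            nn.Conv2d(width, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        return self.block(x)  # residual connection omitted for brevity

def n_params(model):
    return sum(p.numel() for p in model.parameters())

# Same in/out channels, different cardinality/width trade-offs:
print(n_params(GroupedBottleneck(256, 256, cardinality=32, group_width=4)))  # lighter
print(n_params(GroupedBottleneck(256, 256, cardinality=8,  group_width=64)))  # heavier
```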
Table 4 Ablation experiment on components of the MCA module (accuracy/%)

| Method | Base model | CUB-200-2011 | FGVC-Aircraft | Stanford Cars |
|---|---|---|---|---|
| BARN+MCA (CWA) | ResNeXt29×2 + SA | 63.85 | 88.79 | 89.87 |
| BARN+MCA | ResNeXt29×2 + SA | 27.35 | 79.88 | 70.23 |
| BARN+MCA | ResNeXt29×2 + SA | 65.07 | 88.28 | 90.04 |
| BARN+MCA | ResNeXt29×2 + SA | 66.47 | 89.90 | 91.34 |
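If the CWA variant in Table 4 denotes a channel-wise attention component inside MCA (the expansion is not spelled out on this page), a minimal squeeze-and-excitation style channel gate in the spirit of reference [11], which the paper cites, looks like the sketch below. It is offered for orientation only and is not the authors' MCA module.

```python
# SE-style channel attention sketch (cf. reference [11]); NOT the authors' MCA module.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: global context per channel
        self.fc = nn.Sequential(                  # excitation: per-channel gate in [0, 1]
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                              # reweight feature channels

feat = torch.randn(2, 512, 14, 14)
print(ChannelAttention(512)(feat).shape)  # torch.Size([2, 512, 14, 14])
```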
Table 5 Experimental comparison of different weakly supervised fine-grained image classification methods (accuracy/%)

| Method | Base model | CUB-200-2011 | FGVC-Aircraft | Stanford Cars |
|---|---|---|---|---|
| B-CNN[15] | VGG16 | 84.1 | 84.1 | 91.3 |
| MaxEnt[22] | B-CNN | 85.3 | 86.1 | 92.8 |
| PC[23] | B-CNN | 85.6 | 85.8 | 92.5 |
| PC[23] | DenseNet161 | 86.9 | 89.2 | 92.9 |
| MA-CNN[24] | VGG19 | 86.5 | 89.9 | 92.8 |
| DFL-CNN[25] | ResNet50 | 87.4 | 91.7 | 93.9 |
| NTS-Net[5] | ResNet50 | 87.5 | 91.4 | 93.9 |
| TASN[26] | ResNet50 | 87.9 | — | 93.8 |
| DCL[27] | VGG16 | 86.9 | 91.2 | 94.1 |
| WPS-CPM[28] | GoogleNet + ResNet50 | 90.4 | — | — |
| Bi-Modal PMA[29] | ResNet50 | 87.5 | 90.8 | 93.1 |
| BARAN (proposed) | B-CNN + ResNeXt29 | 87.9 | 92.9 | 94.7 |
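BARAN builds on the bilinear pooling of B-CNN [15], which aggregates two feature maps through an outer product over spatial locations followed by signed square-root and L2 normalization. A minimal sketch of that pooling step (our own illustration of the cited technique, not the full BARAN model):

```python
# Bilinear pooling in the spirit of B-CNN [15]; not the full BARAN model.
import torch
import torch.nn.functional as F

def bilinear_pool(fa, fb, eps=1e-12):
    """fa: (B, C1, H, W), fb: (B, C2, H, W) feature maps from two streams."""
    b, c1, h, w = fa.shape
    c2 = fb.shape[1]
    fa = fa.reshape(b, c1, h * w)
    fb = fb.reshape(b, c2, h * w)
    x = torch.bmm(fa, fb.transpose(1, 2)) / (h * w)  # outer product, averaged over locations
    x = x.reshape(b, c1 * c2)
    x = torch.sign(x) * torch.sqrt(x.abs() + eps)    # signed square-root normalization
    return F.normalize(x, dim=1)                     # L2 normalization

fa = torch.randn(2, 512, 14, 14)
fb = torch.randn(2, 512, 14, 14)
print(bilinear_pool(fa, fb).shape)  # torch.Size([2, 262144])
```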
[1] ZHANG N, DONAHUE J, GIRSHICK R B, et al. Part-based R-CNNs for fine-grained category detection[C]// LNCS 8689: Proceedings of the 13th European Conference on Computer Vision, Zurich, Sep 6-12, 2014. Cham: Springer, 2014: 834-849.
[2] LUO J H, WU J X. A survey on fine-grained image categorization using deep convolutional features[J]. Acta Automatica Sinica, 2017, 43(8): 1306-1318.
[3] UIJLINGS J R R, VAN DE SANDE K E A, GEVERS T, et al. Selective search for object recognition[J]. International Journal of Computer Vision, 2013, 104(2): 154-171.
[4] LIN D, SHEN X Y, LU C W, et al. Deep LAC: deep localization, alignment and classification for fine-grained recognition[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, Jun 7-12, 2015. Washington: IEEE Computer Society, 2015: 1666-1674.
[5] YANG Z, LUO T G, WANG D, et al. Learning to navigate for fine-grained classification[C]// LNCS 11218: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 420-435.
[6] BORJI A, ITTI L. State-of-the-art in visual attention modeling[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1): 185-207.
[7] PENG Y H, HE X T, ZHAO J J. Object-part attention model for fine-grained image classification[J]. IEEE Transactions on Image Processing, 2018, 27(3): 1487-1500.
[8] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// LNCS 11211: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 3-19.
[9] HAN K, GUO J Y, ZHANG C, et al. Attribute-aware attention model for fine-grained representation learning[C]// Proceedings of the 2018 ACM Multimedia Conference, Seoul, Oct 22-26, 2018. New York: ACM, 2018: 2040-2048.
[10] GAO Y, HAN X T, WANG X, et al. Channel interaction networks for fine-grained image categorization[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence, the 32nd Innovative Applications of Artificial Intelligence Conference, the 10th AAAI Symposium on Educational Advances in Artificial Intelligence, New York, Feb 7-12, 2020. Menlo Park: AAAI, 2020: 10818-10825.
[11] HU J, SHEN L, ALBANIE S, et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023.
[12] LI X, WANG W H, HU X L, et al. Selective kernel networks[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 510-519.
[13] ZHANG H, WU C R, ZHANG Z Y, et al. ResNeSt: split-attention networks[J]. arXiv:2004.08955, 2020.
[14] CHANG D L, DING Y F, XIE J Y, et al. The devil is in the channels: mutual-channel loss for fine-grained image classification[J]. IEEE Transactions on Image Processing, 2020, 29: 4683-4695.
[15] LIN T Y, ROYCHOWDHURY A, MAJI S. Bilinear CNNs for fine-grained visual recognition[J]. arXiv:1504.07889, 2015.
[16] XIE S N, GIRSHICK R B, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 5987-5995.
[17] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 770-778.
[18] PASZKE A, GROSS S, CHINTALA S, et al. Automatic differentiation in PyTorch[C]// Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, Oct 28, 2017. Red Hook: Curran Associates, 2017: 1-4.
[19] WAH C, BRANSON S, WELINDER P, et al. The Caltech-UCSD Birds-200-2011 dataset[R]. Pasadena: California Institute of Technology, 2011.
[20] MAJI S, RAHTU E, KANNALA J, et al. Fine-grained visual classification of aircraft[J]. arXiv:1306.5151, 2013.
[21] KRAUSE J, STARK M, DENG J, et al. 3D object representations for fine-grained categorization[C]// Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Dec 1-8, 2013. Washington: IEEE Computer Society, 2013: 554-561.
[22] DUBEY A, GUPTA O, RASKAR R, et al. Maximum-entropy fine grained classification[C]// Proceedings of the Annual Conference on Neural Information Processing Systems, Montréal, Dec 3-8, 2018: 635-645.
[23] DUBEY A, GUPTA O, GUO P, et al. Pairwise confusion for fine-grained visual classification[C]// LNCS 11216: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 71-88.
[24] ZHENG H L, FU J L, MEI T, et al. Learning multi-attention convolutional neural network for fine-grained image recognition[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 5219-5227.
[25] WANG Y M, MORARIU V I, DAVIS L S. Learning a discriminative filter bank within a CNN for fine-grained recognition[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-22, 2018. Washington: IEEE Computer Society, 2018: 4148-4157.
[26] ZHENG H L, FU J L, ZHA Z J, et al. Looking for the devil in the details: learning trilinear attention sampling network for fine-grained image recognition[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 5012-5021.
[27] CHEN Y, BAI Y L, ZHANG W, et al. Destruction and construction learning for fine-grained image recognition[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 5157-5166.
[28] GE W F, LIN X R, YU Y Z. Weakly supervised complementary parts models for fine-grained image classification from the bottom up[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 3034-3043.
[29] SONG K T, WEI X S, SHU X B, et al. Bi-modal progressive mask attention for fine-grained recognition[J]. IEEE Transactions on Image Processing, 2020, 29: 7006-7018.
[30] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 618-626.
[31] YANG M L, ZHANG W S. Image classification algorithm based on classification activation map enhancement[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(1): 149-158.