Image Classification Algorithm Based on Classification Activation Map Enhancement

doi:10.3778/j.issn.1673-9418.1902025

Abstract

Classification activation maps (CAMs) suffer from sparseness, discontinuity, and incompleteness, and most current research uses them only for visual analysis. To compensate for these defects, this paper first uses dilated convolution to design an automatically weighted multi-scale feature learning method, and then combines the multi-scale features with the CAM generation method to obtain a multi-scale CAM generation method. The multi-scale CAM is further embedded into the network to form an end-to-end structure that enhances classification performance. Taking ResNet as the backbone, this paper proposes a classification enhancement model, ResNet-CE. Extensive experiments are conducted with ResNet-CE on three publicly available datasets: CIFAR10, CIFAR100 and STL10. The experiments show that ResNet-CE clearly outperforms a ResNet with a similar number of parameters on all three datasets: the error rates are reduced by 0.23%, 3.56% and 7.96%, respectively, and the classification performance is better than that of most mainstream classification models. The proposed algorithm can be easily transferred to off-the-shelf classification models to improve their performance. At the same time, it retains the ability to visualize and interpret the basis of the model's decisions, which has application value in settings such as disease recognition in medical images and scene recognition in autonomous driving.

Keywords: image classification, classification activation map (CAM), multi-scale, visualization, interpretability
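
For readers unfamiliar with CAM, the following is a minimal sketch of the standard single-scale CAM computation that the abstract takes as its starting point. It is not the multi-scale, automatically weighted variant proposed in the paper, whose details are not given here. It assumes a network whose last convolutional layer is followed by global average pooling and one linear layer without bias; the function names, array shapes and min-max normalization are illustrative choices, not taken from the paper.

```python
import numpy as np


def raw_class_activation_map(features, fc_weights, class_idx):
    """Weighted sum of the last conv feature maps for one class.

    features:   array of shape (K, H, W), the K feature maps (assumed layout)
    fc_weights: array of shape (C, K), weights of the linear layer that
                follows global average pooling (assumed, no bias term)
    class_idx:  index of the class to visualize
    """
    # Contract the channel axis: sum_k w[class_idx, k] * features[k] -> (H, W)
    return np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))


def class_activation_map(features, fc_weights, class_idx):
    """Raw CAM rescaled to [0, 1] for display (illustrative normalization)."""
    cam = raw_class_activation_map(features, fc_weights, class_idx)
    cam = cam - cam.min()
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.random((4, 3, 3))           # K=4 maps of size 3x3
    fc_weights = rng.standard_normal((5, 4))   # C=5 classes
    print(class_activation_map(features, fc_weights, class_idx=2).shape)
```

Under these assumptions, the spatial mean of the raw map equals the class score produced by global average pooling followed by the linear layer, which is why the map can be read as showing where the evidence for that class comes from.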

YANG Menglin, ZHANG Wensheng. Image Classification Algorithm Based on Classification Activation Map Enhancement[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(1): 149-158.
