[1] HEO B, KIM J, YUN S, et al. A comprehensive overhaul of feature distillation[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 1921-1930.
[2] ZAGORUYKO S, KOMODAKIS N. Paying more attention to attention: improving the performance of convolutional neural networks via attention transfer[J]. arXiv:1612.03928, 2016.
[3] ROMERO A, BALLAS N, KAHOU S E, et al. FitNets: hints for thin deep nets[J]. arXiv:1412.6550, 2014.
[4] 孟宪法, 刘方, 李广, 等. 卷积神经网络压缩中的知识蒸馏技术综述[J]. 计算机科学与探索, 2021, 15(10): 1812-1829.
MENG X F, LIU F, LI G, et al. Review of knowledge distillation in convolutional neural network compression[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(10): 1812-1829.
[5] 耿丽丽, 牛保宁. 深度神经网络模型压缩综述[J]. 计算机科学与探索, 2020, 14(9): 1441-1455.
GENG L L, NIU B N. Survey of deep neural networks model compression[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(9): 1441-1455.
[6] CHEN H T, WANG Y H, XU C, et al. Data-free learning of student networks[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 3513-3521.
[7] FANG G F, SONG J, WANG X C, et al. Contrastive model inversion for data-free knowledge distillation[J]. arXiv:2105.08584, 2021.
[8] NAYAK G K, MOPURI K R, SHAJ V, et al. Zero-shot knowledge distillation in deep networks[C]//Proceedings of the 36th International Conference on Machine Learning, Long Beach, Jun 9-15, 2019: 4743-4751.
[9] CHOI Y, CHOI J, EL-KHAMY M, et al. Data-free network quantization with adversarial knowledge distillation[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 3047-3057.
[10] YIN H X, MOLCHANOV P, ALVAREZ J M, et al. Dreaming to distill: data-free knowledge transfer via deep inversion[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 8712-8721.
[11] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 770-778.
[12] BA J, CARUANA R. Do deep nets really need to be deep?[C]//Advances in Neural Information Processing Systems 27, Montreal, Dec 8-13, 2014: 1-9.
[13] JUNG S, LEE D, PARK T, et al. Fair feature distillation for visual recognition[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Piscataway: IEEE, 2021: 12115-12124.
[14] LAN X, ZHU X T, GONG S G. Knowledge distillation by on-the-fly native ensemble[C]//Advances in Neural Information Processing Systems 31, Montreal, Dec 3-8, 2018: 7528-7538.
[15] XIANG L Y, DING G G, HAN J G. Learning from multiple experts: self-paced knowledge distillation for long-tailed classification[C]//LNCS 12350: Proceedings of the 16th European Conference on Computer Vision, Glasgow, Aug 23-28, 2020. Cham: Springer, 2020: 247-263.
[16] LIU I J, PENG J, SCHWING A G. Knowledge flow: improve upon your teachers[J]. arXiv:1904.05878, 2019.
[17] YOU S, XU C, XU C, et al. Learning from multiple teacher networks[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Aug 13-17, 2017. New York: ACM, 2017: 1285-1294.
[18] ZHOU P, MAI L, ZHANG J M, et al. M2KD: multi-model and multi-level knowledge distillation for incremental learning[J]. arXiv:1904.01769, 2019.
[19] LOPES R G, FENU S, STARNER T. Data-free knowledge distillation for deep neural networks[J]. arXiv:1710.07535, 2017.
[20] LIU Y, ZHANG W, WANG J. Zero-shot adversarial quantization[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Piscataway: IEEE, 2021: 1512-1521.
[21] CAI Y H, YAO Z W, DONG Z, et al. ZeroQ: a novel zero shot quantization framework[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 13166-13175.
[22] HAROUSH M, HUBARA I, HOFFER E, et al. The knowledge within: methods for data-free model compression[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 8491-8499.
[23] LI Y H, ZHU F, GONG R H, et al. MixMix: all you need for data-free compression are feature and data mixing[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Oct 10-17, 2021. Piscataway: IEEE, 2021: 4390-4399.
[24] CHEN T, KORNBLITH S, NOROUZI M, et al. A simple framework for contrastive learning of visual representations[C]//Proceedings of the 37th International Conference on Machine Learning, Jul 13-18, 2020: 1597-1607.
[25] HADSELL R, CHOPRA S, LECUN Y. Dimensionality reduction by learning an invariant mapping[C]//Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, Jun 17-22, 2006. Washington: IEEE Computer Society, 2006: 1735-1742.
[26] HE K M, FAN H Q, WU Y X, et al. Momentum contrast for unsupervised visual representation learning[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 9726-9735.
[27] VAN DEN OORD A, LI Y, VINYALS O. Representation learning with contrastive predictive coding[J]. arXiv:1807.03748, 2018.
[28] WU Z R, XIONG Y J, YU S X, et al. Unsupervised feature learning via non-parametric instance discrimination[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Piscataway: IEEE, 2018: 3733-3742.
[29] TIAN Y L, KRISHNAN D, ISOLA P. Contrastive representation distillation[J]. arXiv:1910.10699, 2019.
[30] XU G D, LIU Z W, LI X X, et al. Knowledge distillation meets self-supervision[C]//LNCS 12354: Proceedings of the 16th European Conference on Computer Vision, Glasgow, Aug 23-28, 2020. Cham: Springer, 2020: 588-604.
[31] ZHU J G, TANG S X, CHEN D P, et al. Complementary relation contrastive distillation[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Piscataway: IEEE, 2021: 9260-9269.
[32] FANG G F, SONG J, SHEN C C, et al. Data-free adversarial distillation[J]. arXiv:1912.11006, 2019.
[33] MICAELLI P, STORKEY A J. Zero-shot knowledge transfer via adversarial belief matching[J]. arXiv:1905.09768, 2019.
[34] ALLEN-ZHU Z, LI Y Z. Towards understanding ensemble, knowledge distillation and self-distillation in deep learning[J]. arXiv:2012.09816, 2020.
[35] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. arXiv:1409.1556, 2014.
[36] ZAGORUYKO S, KOMODAKIS N. Wide residual networks[J]. arXiv:1605.07146, 2016.
[37] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 2818-2826.
[38] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-22, 2018. Washington: IEEE Computer Society, 2018: 4510-4520.
[39] LI F F, FERGUS R, PERONA P. Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories[C]//Proceedings of the 2004 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2004: 178-178.
[40] VAN DER MAATEN L, HINTON G. Visualizing data using t-SNE[J]. Journal of Machine Learning Research, 2008, 9(11): 2579-2605.