Review of Image Data Augmentation in Computer Vision

doi:10.3778/j.issn.1673-9418.2102015

Abstract

Deep learning is currently a leading approach to computer vision, but it depends on massive, high-quality image training datasets. Collecting and accurately labeling such datasets is time-consuming and expensive, and the problem becomes more pronounced as computer vision applications spread. Image augmentation techniques address the problem of training deep learning models on small-scale or low-quality data, and they have evolved alongside deep learning and computer vision. This paper first reviews image augmentation research from the perspectives of augmentation objects, operation spaces, label processing methods, and augmentation strategies, and summarizes the paradigms underlying current image data augmentation methods. Guided by these paradigms, it then proposes a taxonomy of current image data augmentation methods and reviews representative methods in each category. Finally, it summarizes existing image data augmentation research, points out open problems in the field, and presents promising directions for future research.

Key words: deep learning, computer vision, image augmentation, data augmentation, image enhancement

LIN Chengchuang, SHAN Chun, ZHAO Gansen, et al. Review of Image Data Augmentation in Computer Vision[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(4): 583-611.

林成创, 单纯, 赵淦森, 等. 机器视觉应用中的图像数据增广综述[J]. 计算机科学与探索, 2021, 15(4): 583-611.
