Journal of Frontiers of Computer Science and Technology

• Science Researches •

Survey of AI Painting

ZHANG Zeyu, WANG Tiejun, GUO Xiaoran, LONG Zhilei, XU Kui   

  1. Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou 730030, China
  2. School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou 730030, China

Abstract: AI painting, a popular research direction in computer vision, is steadily extending its reach in art creation, film and media, industrial design, and art education through natural language processing, large image-text pre-trained models, and the more recent diffusion models. This survey is organized around two AI painting tasks, Image-to-Image and Text-to-Image, and analyzes representative models together with their key technologies and methods. For Image-to-Image, it covers autoencoder-based models and models based on generative adversarial networks (GANs), discussing the development history, generation principles, and strengths and weaknesses of each, and summarizing their results on public datasets. For Text-to-Image, it summarizes the structural differences among three classes of models, including diffusion-based ones, and compares their generation quality on three datasets. It notes that diffusion-based Text-to-Image generation has become the current research focus and points toward more diverse ways of generating images. It also compares mainstream AI painting platforms in terms of usage mode, generation speed, and related aspects. Finally, after summarizing the technical and societal problems and controversies facing AI painting, it looks ahead to trends such as complementary development between AI painting and human artists, more interactive painting processes, and the emergence of new professions and industries.

Key words: AI Painting, Image-to-Image, Text-to-Image, image generation, artificial intelligence generated content (AIGC)