Journal of Frontiers of Computer Science and Technology, 2026, Vol. 20, Issue (2): 301-325. DOI: 10.3778/j.issn.1673-9418.2506009

• Frontiers·Surveys •

Review of EEG-Based Multimodal Content Generation Techniques

LIN Chengde1,2,3, YANG Mingzhe2, MO Chengjun2, LI Guohui3+   

  1. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
  2. School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
  3. Tianjin Development Zone Orking High Tech. Co., Ltd., Tianjin 300457, China
  + Corresponding author  E-mail: orking2024@163.com
  • Received: 2025-06-05  Revised: 2025-10-13  Online: 2026-02-01  Published: 2026-02-01
  • Supported by:
    This work was supported by the General Program of the China Postdoctoral Science Foundation (2025M781503), the Key Research and Development Program of Guangxi (Guike AB25069017), the Scientific Research Capacity Enhancement Project for Young and Middle-aged Teachers in Higher Education Institutions of Guangxi (2025KY0253), the Science and Technology Research and Development Program of Guilin City (20230120-3), the Innovation Project of Guangxi Graduate Education (YCSW2025360), and the National College Students’ Innovation and Entrepreneurship Training Program (202410595021).


Abstract: Electroencephalogram (EEG)-based multimodal content generation is an emerging research direction in the fields of brain-computer interfaces (BCI) and artificial intelligence. It aims to reconstruct multimodal content by decoding EEG signals, providing a new paradigm for analyzing brain function and building interactive systems. The technology integrates EEG preprocessing, feature extraction, and cross-modal generation, and has advanced notably in recent years in both methodological innovation and application exploration. This review first outlines the characteristics of EEG signals and categorizes EEG-based tasks in artificial intelligence. To address inherent problems of EEG such as low signal-to-noise ratio and limited spatial resolution, it then systematically traces the development of preprocessing techniques, grouped into three key methods (denoising, augmentation, and super-resolution), and compares the advantages and limitations of traditional approaches and deep learning models in improving EEG data quality and usability. For feature extraction, it compares traditional time-, frequency-, and spatial-domain methods with deep learning models, summarizes the latest technical pathways, and analyzes their impact on decoding accuracy. For multimodal content generation, it organizes research progress by modality, covering image, audio, text, and video generation, analyzes the main network architectures and methods currently in use, and discusses the challenges the technology faces. Finally, it summarizes the current state of research and its open problems and outlines future development trends.

Key words: EEG signal decoding, Deep learning, Multimodal content generation, Brain-computer interface
