Multimodal Information Fusion-Guided Graphical Interface Code Generation Framework for OpenFOAM

doi:10.3778/j.issn.1673-9418.2508051

Abstract

OpenFOAM has a steep learning curve because it relies on command-line operation, and traditional simulation interfaces are slow to develop and costly to customize. To address both problems, this paper proposes AutoCode4OF, a framework for end-to-end automatic generation of complete, executable OpenFOAM interface code from multimodal inputs. The framework makes three main contributions: (1) at the input level, it fuses images, natural-language text, and existing code snippets to jointly represent user intent; (2) for knowledge processing, it builds a domain knowledge graph for computational fluid dynamics (CFD) and combines retrieval-augmented generation (RAG) with dual front-end and back-end validation, which significantly improves the physical plausibility and reliability of the generated code; (3) in its system architecture, it adopts multi-agent collaboration, decomposing the overall task into specialized modules for knowledge retrieval, task planning, code generation, material setup, and test verification, which work together to ensure the quality and completeness of the output. Experimental results show that AutoCode4OF scores 0.956 in code quality and 0.997 in functional completeness, with a compilation success rate of 100%. It is applicable and stable across mesh generation, solution computation, and post-processing result validation, which indicates practical engineering value and offers new ideas for the intelligent development of scientific computing software.

Key words: OpenFOAM, multimodal large language model, knowledge graph, multi-agent systems, code generation
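
The multi-agent decomposition described in the abstract can be illustrated with a minimal Python sketch. This is not the AutoCode4OF implementation: the names `Context`, `TOY_GRAPH`, `retrieve`, `plan`, `generate`, `verify`, and `run_pipeline` are hypothetical, the two-entry dictionary only stands in for the CFD knowledge graph, keyword matching stands in for RAG, and the material setup module is omitted because the abstract does not describe it. The sketch assumes the agents run sequentially over a shared context.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the CFD knowledge graph: solver name -> short fact.
TOY_GRAPH = {
    "icoFoam": "transient incompressible laminar flow",
    "simpleFoam": "steady-state incompressible turbulent flow",
}


@dataclass
class Context:
    """Shared state handed from one agent to the next (illustrative only)."""
    request: str                                   # text part of the user input
    facts: list[str] = field(default_factory=list)
    plan: list[str] = field(default_factory=list)
    code: str = ""
    checks_passed: bool = False


def retrieve(ctx: Context) -> Context:
    # Knowledge retrieval: keep graph entries whose solver name occurs in the request.
    text = ctx.request.lower()
    ctx.facts = [f"{name}: {desc}" for name, desc in TOY_GRAPH.items()
                 if name.lower() in text]
    return ctx


def plan(ctx: Context) -> Context:
    # Task planning: the three stages named in the abstract.
    ctx.plan = ["mesh generation", "solution computation", "post-processing"]
    return ctx


def generate(ctx: Context) -> Context:
    # Code generation: emit a stub function that records the plan and the facts.
    lines = ["def build_interface():"]
    lines += [f"    # fact: {fact}" for fact in ctx.facts]
    lines += [f"    steps = {ctx.plan!r}", "    return steps"]
    ctx.code = "\n".join(lines)
    return ctx


def verify(ctx: Context) -> Context:
    # Test verification: a toy check that the generated text is valid Python.
    try:
        compile(ctx.code, "<generated>", "exec")
        ctx.checks_passed = True
    except SyntaxError:
        ctx.checks_passed = False
    return ctx


AGENTS = [retrieve, plan, generate, verify]


def run_pipeline(request: str) -> Context:
    ctx = Context(request=request)
    for agent in AGENTS:
        ctx = agent(ctx)
    return ctx


if __name__ == "__main__":
    result = run_pipeline("Build a GUI for an icoFoam cavity case")
    print(result.code)
    print("checks passed:", result.checks_passed)
```

Passing a single context object through an ordered list of agents keeps each stage independently testable; a real system would replace each function with a model-backed agent and add the multimodal inputs.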

LU Bin, LIU Jianfeng, WANG Haolin, ZHANG Yuzhi, CHEN Rui. Multimodal Information Fusion-Guided Graphical Interface Code Generation Framework for OpenFOAM[J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(12): 3243-3256.

