Diffusion models have achieved remarkable success in image and video generation. However, their inherently multi-step inference process imposes substantial computational overhead, hindering real-world deployment. Accelerating diffusion models is therefore essential, yet determining how to combine multiple acceleration techniques remains a significant challenge. To address this issue, we introduce a framework driven by large language models (LLMs) for automated acceleration code generation and evaluation. First, we present DiffBench, a comprehensive benchmark that implements a three-stage automated evaluation pipeline across diverse diffusion architectures, optimization combinations, and deployment scenarios. Second, we propose DiffAgent, an agent that generates optimal acceleration strategies and code for arbitrary diffusion models. DiffAgent employs a closed-loop workflow in which a planning component and a debugging component iteratively refine the output of a code-generation component, while a genetic algorithm extracts performance feedback from the execution environment to guide subsequent code refinements. We detail the construction of DiffBench and the design principles underlying DiffAgent. Extensive experiments show that DiffBench provides a thorough evaluation of generated code and that DiffAgent significantly outperforms existing LLMs in producing effective diffusion acceleration strategies.
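The genetic-algorithm feedback loop described above can be illustrated with a minimal sketch. The search space, fitness model, and all identifiers below are hypothetical stand-ins (not DiffAgent's actual configuration schema or evaluation harness): candidate acceleration configurations are evolved, with a mock fitness function playing the role of executing the generated code and measuring latency/quality in the real environment.

```python
import random

# Hypothetical search space of acceleration knobs (illustrative only;
# not DiffAgent's real configuration schema).
SEARCH_SPACE = {
    "num_steps": [10, 20, 30, 50],       # sampler step count
    "use_cache": [True, False],          # feature caching on/off
    "quantize": ["none", "int8", "fp8"], # weight quantization mode
}

def random_candidate(rng):
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def mock_fitness(cand):
    # Stand-in for running the generated acceleration code and collecting
    # performance feedback from the execution environment.
    latency = cand["num_steps"] * (0.5 if cand["use_cache"] else 1.0)
    latency *= {"none": 1.0, "int8": 0.7, "fp8": 0.6}[cand["quantize"]]
    quality_penalty = {"none": 0.0, "int8": 1.0, "fp8": 2.0}[cand["quantize"]]
    quality_penalty += 5.0 if cand["num_steps"] == 10 else 0.0
    return -(latency + quality_penalty)  # higher is better

def crossover(a, b, rng):
    # Child inherits each knob from one of the two parents.
    return {k: rng.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(cand, rng, rate=0.2):
    out = dict(cand)
    for k, choices in SEARCH_SPACE.items():
        if rng.random() < rate:
            out[k] = rng.choice(choices)
    return out

def evolve(generations=10, pop_size=12, seed=0):
    rng = random.Random(seed)
    pop = [random_candidate(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=mock_fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep top half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=mock_fitness)

best = evolve()
print(best)
```

In the full system, the fitness signal would come from actually compiling and benchmarking the agent-generated acceleration code, and the surviving configurations would steer the planner's next round of code refinement.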