Diffusion models generate new samples by progressively removing noise from an input drawn from an initial random distribution. This inference procedure typically invokes a trained neural network many times to obtain the final output, which incurs significant latency and energy consumption on digital electronic hardware such as GPUs. In this study, we demonstrate that the propagation of a light beam through a semi-transparent medium can be programmed to implement a denoising diffusion model on image samples. The framework projects noisy image patterns through passive diffractive optical layers, which collectively transmit only the predicted noise term in the image. These transparent optical layers are trained with an online training approach that backpropagates the error to the analytical model of the system; they are passive and remain the same across all denoising steps. Hence, this method enables high-speed image generation with minimal power consumption, benefiting from the bandwidth and energy efficiency of optical information processing.
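The iterative procedure described in the abstract, in which one fixed noise predictor is reused at every denoising step, can be illustrated with a minimal sketch. It assumes the standard DDPM reverse update and a linear noise schedule, neither of which is specified in the abstract. The callable `predict_noise` is a hypothetical software stand-in for the passive diffractive layers, and `make_oracle_predictor` is an illustrative helper (not part of the described system) that returns the exact noise for a known clean image, so the loop can be checked end to end.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption; the abstract gives no schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of the alphas
    return betas, alphas, alpha_bars

def sample(predict_noise, shape, num_steps=50, seed=0):
    """Run the reverse (denoising) process with one fixed noise predictor.

    predict_noise(x, t) plays the role of the passive optical layers: the
    same function is called unchanged at every step.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in range(num_steps - 1, -1, -1):
        eps_hat = predict_noise(x, t)  # predicted noise term
        # Standard DDPM posterior mean: subtract the scaled noise estimate.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Fresh noise is added at every step except the last one.
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x

def make_oracle_predictor(x0, alpha_bars):
    """Hypothetical ideal predictor: the exact noise for a known clean image x0."""
    def predict_noise(x, t):
        return (x - np.sqrt(alpha_bars[t]) * x0) / np.sqrt(1.0 - alpha_bars[t])
    return predict_noise
```

With the oracle predictor, the final step of the loop recovers the clean image `x0`, which gives a quick sanity check of the update rule. In the optical setting, the call to `predict_noise` would correspond to one pass of the noisy pattern through the fixed diffractive layers, while the remaining arithmetic would stay in the surrounding control loop; that division is an assumption of this sketch, not a detail given in the abstract.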