Diffusion models generate new samples by progressively removing noise from an input drawn from an initial random distribution. This inference procedure generally invokes a trained neural network many times to obtain the final output, which incurs significant latency and energy consumption on digital electronic hardware such as GPUs. In this study, we demonstrate that the propagation of a light beam through a semi-transparent medium can be programmed to implement a denoising diffusion model on image samples. The framework projects noisy image patterns through passive diffractive optical layers, which collectively transmit only the predicted noise term in the image. The transparent optical layers are trained with an online approach that backpropagates the error through the analytical model of the system; they are passive and remain unchanged across the denoising steps. Hence, the method enables high-speed image generation with minimal power consumption, benefiting from the bandwidth and energy efficiency of optical information processing.
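The iterative procedure described above, in which one fixed noise predictor is reused at every denoising step, can be illustrated with a minimal numerical sketch. Everything in it is an assumption made for illustration: the abstract does not specify the update rule, so a standard DDPM-style reverse step with a linear noise schedule is used, and `predict_noise` is a hypothetical analytic stand-in for the diffractive layers (it is exact only for a toy dataset whose clean images are all zeros). It does not model optical propagation.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of alphas up to each step
    return betas, alphas, alpha_bars


def predict_noise(x_t, alpha_bar_t):
    """Hypothetical stand-in for the fixed noise predictor.

    Toy assumption: every clean image is all zeros, so
    x_t = sqrt(1 - alpha_bar_t) * eps and the noise is recovered exactly.
    A real predictor would be learned rather than written in closed form.
    """
    return x_t / np.sqrt(1.0 - alpha_bar_t)


def sample(shape=(8, 8), num_steps=50, seed=0):
    """Run the reverse (denoising) chain, starting from pure Gaussian noise."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(shape)
    for t in range(num_steps - 1, -1, -1):
        # The same predictor is called at every step; it is never updated.
        eps_hat = predict_noise(x, alpha_bars[t])
        # DDPM-style posterior mean: remove the predicted noise, then rescale.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Fresh noise is injected on every step except the last one.
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    image = sample()
    print("largest absolute pixel value:", np.abs(image).max())
```

With this toy predictor the chain converges to the all-zero image, which is the only sample in the assumed dataset. In the setting the abstract describes, the call to `predict_noise` is the part that the passive optical layers would carry out, while the surrounding loop repeats with those layers left unchanged.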