Subsampling is commonly used to mitigate costs associated with data acquisition, such as time or energy requirements, motivating the development of algorithms for estimating the fully sampled signal of interest $x$ from partially observed measurements $y$. In maximum-entropy sampling, one selects the measurement locations that are expected to have the highest entropy, so as to minimize uncertainty about $x$. This approach relies on an accurate model of the posterior distribution over future measurements, given the measurements observed so far. Recently, diffusion models have been shown to produce high-quality posterior samples of high-dimensional signals using guided diffusion. In this work, we propose Active Diffusion Subsampling (ADS), a method for performing active subsampling using guided diffusion, in which the model tracks a distribution of beliefs over the true state of $x$ throughout the reverse diffusion process, progressively decreasing its uncertainty by choosing to acquire measurements with maximum expected entropy, and ultimately generating the posterior distribution $p(x | y)$. ADS can be applied using pre-trained diffusion models at any subsampling rate and does not require task-specific retraining; only a measurement model needs to be specified. Furthermore, the maximum-entropy sampling policy employed by ADS is interpretable, enhancing transparency relative to existing methods that use black-box policies. Experimentally, we show that ADS outperforms fixed sampling strategies, and we study an application of ADS to Magnetic Resonance Imaging acceleration using the fastMRI dataset, finding that ADS performs competitively with supervised methods. Code is available at https://active-diffusion-subsampling.github.io/.
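The selection step described above can be illustrated with a minimal sketch. It is not the ADS implementation: it assumes the belief distribution is represented by a set of sample estimates of $x$ (called `particles` here), that each measurement observes a single location of $x$, and that the entropy at each location can be approximated by the entropy of a Gaussian fitted to the particle values there. The function name `select_next_measurement` and all variable names are hypothetical.

```python
import numpy as np


def select_next_measurement(particles, observed_mask):
    """Pick the unobserved location with the highest approximate entropy.

    particles:     (n_particles, n_locations) array of belief samples of x
                   (assumption: one row per current estimate of x).
    observed_mask: (n_locations,) boolean array, True where already measured.
    """
    # Spread of the beliefs at each location.
    variance = particles.var(axis=0)
    # Differential entropy of a Gaussian with that variance (an assumption,
    # used here as a stand-in for the entropy of the predicted measurement).
    entropy = 0.5 * np.log(2.0 * np.pi * np.e * (variance + 1e-12))
    # Never re-acquire a location that has already been measured.
    entropy = np.where(observed_mask, -np.inf, entropy)
    return int(np.argmax(entropy))


# Toy usage: four locations whose beliefs have very different spreads.
rng = np.random.default_rng(0)
scales = np.array([0.1, 2.0, 0.5, 3.0])
particles = rng.normal(size=(64, 4)) * scales
observed = np.array([False, False, False, True])  # location 3 already measured
next_idx = select_next_measurement(particles, observed)
```

In this toy setting the most uncertain location (index 3) is already observed, so the rule falls back to the most uncertain remaining one. In an active loop, the chosen location would be measured, added to $y$, and used to guide the subsequent reverse diffusion steps.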