A New Perspective On Denoising Based On Optimal Transport

In the standard formulation of the denoising problem, one is given a probabilistic model relating a latent variable $\Theta \in \Omega \subset \mathbb{R}^m \; (m\ge 1)$ and an observation $Z \in \mathbb{R}^d$ according to: $Z \mid \Theta \sim p(\cdot\mid \Theta)$ and $\Theta \sim G^*$, and the goal is to construct a map that recovers the latent variable from the observation. The posterior mean, a natural candidate for estimating $\Theta$ from $Z$, attains the minimum Bayes risk (under the squared error loss), but at the expense of over-shrinking the observation $Z$, and in general it may fail to capture the geometric features of the prior distribution $G^*$ (e.g., low dimensionality, discreteness, sparsity). To rectify these drawbacks, we take a new perspective on this denoising problem that is inspired by optimal transport (OT) theory and use it to study a different, OT-based, denoiser in the population-level setting. We rigorously prove that, under general assumptions on the model, this OT-based denoiser is mathematically well-defined and unique, and is closely connected to the solution to a Monge OT problem. We then prove that, under appropriate identifiability assumptions on the model, the OT-based denoiser can be recovered solely from knowledge of the marginal distribution of $Z$ and the posterior mean of the model, after solving a linear relaxation problem over a suitable space of couplings that is reminiscent of standard multimarginal OT problems. In particular, thanks to Tweedie's formula, when the likelihood model $\{ p(\cdot \mid \theta) \}_{\theta \in \Omega}$ is an exponential family of distributions, the OT-based denoiser can be recovered solely from the marginal distribution of $Z$. More broadly, our family of OT-like relaxations is of interest in its own right and, for the denoising problem, suggests alternative numerical methods inspired by the rich literature on computational OT.
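
As a standard illustration of the role Tweedie's formula plays here (this example is not taken from the text above; the Gaussian model, the noise level $\sigma$, the prior scale $\tau$, and the notation $f_Z$ for the marginal density of $Z$ are assumptions introduced only for this sketch), take $d = m = 1$, $Z \mid \Theta \sim N(\Theta, \sigma^2)$ and $\Theta \sim N(0, \tau^2)$. In this Gaussian location model, Tweedie's formula writes the posterior mean in terms of the marginal density of $Z$ alone:

$$
\mathbb{E}[\Theta \mid Z = z] \;=\; z + \sigma^2 \, \frac{d}{dz} \log f_Z(z).
$$

Since $Z \sim N(0, \sigma^2 + \tau^2)$ under these assumptions, $\frac{d}{dz} \log f_Z(z) = -z/(\sigma^2 + \tau^2)$, and therefore

$$
\mathbb{E}[\Theta \mid Z = z] \;=\; \frac{\tau^2}{\sigma^2 + \tau^2}\, z,
\qquad
\operatorname{Var}\big(\mathbb{E}[\Theta \mid Z]\big) \;=\; \frac{\tau^4}{\sigma^2 + \tau^2} \;<\; \tau^2 \;=\; \operatorname{Var}(\Theta).
$$

In this toy case the posterior mean is computable from the law of $Z$ only, yet its distribution is strictly more concentrated than $G^* = N(0, \tau^2)$, which is one simple way to see the shrinkage effect mentioned above.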
