We consider the problem of recovering a latent signal $X$ from its noisy observation $Y$. The unknown law $\mathbb{P}^X$ of $X$, and in particular its support $\mathscr{M}$, are accessible only through a large sample of i.i.d.\ observations. We further assume $\mathscr{M}$ to be a low-dimensional submanifold of a high-dimensional Euclidean space $\mathbb{R}^d$. As a filter or denoiser $\widehat X$, we suggest an estimator of the metric projection $\pi_{\mathscr{M}}(Y)$ of $Y$ onto the manifold $\mathscr{M}$. To compute this estimator, we study an auxiliary semiparametric model in which $Y$ is obtained by adding isotropic Laplace noise to $X$. Using score matching within a corresponding diffusion model, we obtain an estimator of the Bayesian posterior $\mathbb{P}^{X \mid Y}$ in this setup. Our main theoretical results show that, in the limit of high dimension $d$, this posterior $\mathbb{P}^{X\mid Y}$ is concentrated near the desired metric projection $\pi_{\mathscr{M}}(Y)$.