We initiate the study of nonparametric empirical Bayes denoising methods in the setting where both the latent variables and their measurements lie on a compact Riemannian manifold, and where the likelihood is a Riemannian Gaussian distribution. Our starting point is a novel Tweedie-Eddington formula for Riemannian Gaussian mixture models, which identifies a certain surrogate oracle denoiser in terms of the marginal distribution of the measurements. Because it replaces the explicit computation of the posterior Fréchet mean (as required by the Bayes denoiser) with a first-order approximation, we refer to it as the "tangential" Bayes denoiser. We show that this surrogate oracle nearly achieves the Bayes risk in a low-noise regime, construct a fully data-driven approximation of it using the spectral theory of the Laplace-Beltrami operator, and establish finite-sample rates of convergence for the distance between the surrogate oracle and its approximation. In contrast to the nearly parametric rates of the Euclidean setting, the rates in the Riemannian setting are slower because of the singularities of the Riemannian Gaussian density at the cut locus of its Fréchet mean. In the special case of the circle, we establish matching lower bounds, which show that our proposed denoiser is minimax-optimal and that the denoising problem exhibits a genuinely nonparametric rate of convergence. Lastly, we implement our methodology in two scientific applications: in astronomy, the sphere-valued problem of denoising the locations of gamma-ray bursts; in structural biology, the torus-valued problem of denoising pairs of torsion angles of adjacent amino acids in a protein (i.e., the Ramachandran plot).
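For orientation, the classical Euclidean Tweedie-Eddington formula, which the Riemannian formula above generalizes, can be written as follows. The notation is introduced here for illustration and is not taken from the paper: $\theta$ is the latent variable, $X \mid \theta \sim N(\theta, \sigma^2 I)$ is its measurement, and $f$ is the marginal density of $X$.

```latex
% Classical (Euclidean) Tweedie-Eddington formula; notation is assumed here,
% not the paper's. The posterior mean depends on the prior only through the
% marginal density f of the measurements.
\[
  \mathbb{E}[\theta \mid X = x] \;=\; x + \sigma^{2}\, \nabla \log f(x).
\]
```

The posterior mean is thus determined by the marginal density $f$ alone, which is what makes a data-driven estimate possible: $f$ can be estimated from the measurements without knowledge of the prior.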