Precise Performance of Linear Denoisers in the Proportional Regime

We study the performance of linear denoisers for noisy data of the form $\mathbf{x} + \mathbf{z}$, where $\mathbf{x} \in \mathbb{R}^d$ is the desired data with zero mean and unknown covariance $\mathbf{\Sigma}$, and $\mathbf{z} \sim \mathcal{N}(0, \mathbf{\Sigma}_{\mathbf{z}})$ is additive noise. Since the covariance $\mathbf{\Sigma}$ is not known, the standard Wiener filter cannot be employed for denoising. Instead, we assume we are given samples $\mathbf{x}_1,\dots,\mathbf{x}_n \in \mathbb{R}^d$ from the true distribution. A standard approach would then be to estimate $\mathbf{\Sigma}$ from the samples and use the estimate to construct an ``empirical'' Wiener filter. In this paper, motivated by the denoising step in diffusion models, we take a different approach and train a linear denoiser $\mathbf{W}$ directly from the data. In particular, we synthetically construct noisy samples $\hat{\mathbf{x}}_i$ by adding Gaussian noise with covariance $\mathbf{\Sigma}_1 \neq \mathbf{\Sigma}_{\mathbf{z}}$ to the samples, and we find the $\mathbf{W}$ that best achieves $\mathbf{W}\hat{\mathbf{x}}_i \approx \mathbf{x}_i$ in a least-squares sense. In the proportional regime $\frac{n}{d} \rightarrow \kappa > 1$, we use the \textit{Convex Gaussian Min-Max Theorem} (CGMT) to derive a closed-form expression for the generalization error of the resulting denoiser. Using this expression, one can optimize over $\mathbf{\Sigma}_1$ to find the best possible denoiser. Our numerical simulations show that our denoiser outperforms the ``empirical'' Wiener filter in many scenarios and approaches the optimal Wiener filter as $\kappa \rightarrow \infty$.
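The training procedure described above can be illustrated with a minimal numerical sketch. It assumes NumPy, Gaussian data, and illustrative values of $d$, $n$, $\mathbf{\Sigma}$, $\mathbf{\Sigma}_{\mathbf{z}}$, and $\mathbf{\Sigma}_1$ that are not taken from the paper; the function names are hypothetical. It evaluates each linear denoiser with the exact population error $\mathrm{tr}\big((\mathbf{W}-\mathbf{I})\mathbf{\Sigma}(\mathbf{W}-\mathbf{I})^\top\big) + \mathrm{tr}\big(\mathbf{W}\mathbf{\Sigma}_{\mathbf{z}}\mathbf{W}^\top\big)$, which follows from $\mathbf{x}$ and $\mathbf{z}$ being independent and zero mean. It does not reproduce the CGMT analysis.

```python
import numpy as np


def train_linear_denoiser(X, Sigma1, rng):
    """Fit W by least squares so that W @ x_hat_i is close to x_i.

    X is (d, n) with clean samples as columns; Sigma1 is the covariance
    of the synthetic training noise (a free design choice).
    """
    d, n = X.shape
    noise = np.linalg.cholesky(Sigma1) @ rng.standard_normal((d, n))
    X_hat = X + noise
    # min_W ||W X_hat - X||_F^2 is the least-squares problem X_hat^T W^T = X^T.
    W_T, *_ = np.linalg.lstsq(X_hat.T, X.T, rcond=None)
    return W_T.T


def empirical_wiener(X, Sigma_z):
    """Wiener filter with the sample covariance plugged in for Sigma."""
    n = X.shape[1]
    S = X @ X.T / n
    # S (S + Sigma_z)^{-1}; both matrices are symmetric, so transpose the solve.
    return np.linalg.solve(S + Sigma_z, S).T


def denoising_mse(W, Sigma, Sigma_z):
    """Exact E||W(x + z) - x||^2 for zero-mean, independent x and z."""
    E = W - np.eye(Sigma.shape[0])
    return np.trace(E @ Sigma @ E.T) + np.trace(W @ Sigma_z @ W.T)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 50, 200                                # illustrative: n / d = 4
    Sigma = np.diag(1.0 / np.arange(1, d + 1))    # assumed decaying spectrum
    Sigma_z = 0.1 * np.eye(d)                     # assumed test-noise covariance
    X = np.linalg.cholesky(Sigma) @ rng.standard_normal((d, n))

    W_opt = Sigma @ np.linalg.inv(Sigma + Sigma_z)   # oracle Wiener filter
    print("optimal Wiener  :", denoising_mse(W_opt, Sigma, Sigma_z))
    print("empirical Wiener:",
          denoising_mse(empirical_wiener(X, Sigma_z), Sigma, Sigma_z))
    for s in (0.05, 0.1, 0.2):                    # scan isotropic Sigma_1 = s * I
        W = train_linear_denoiser(X, s * np.eye(d), rng)
        print(f"trained, Sigma_1 = {s} I:", denoising_mse(W, Sigma, Sigma_z))
```

Scanning over the scale of $\mathbf{\Sigma}_1$ in the loop is a crude stand-in for the optimization over $\mathbf{\Sigma}_1$ mentioned above; the oracle Wiener filter is included only as a lower bound on the error of any linear denoiser.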
