Robust Mean Estimation Without Moments for Symmetric Distributions

We study the problem of robustly estimating the mean (or location parameter) without moment assumptions. We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently. The distributions we study include products of arbitrary symmetric one-dimensional distributions, such as product Cauchy distributions, as well as elliptical distributions. For product distributions and elliptical distributions with known scatter (covariance) matrix, we show that given an $\varepsilon$-corrupted sample, we can with probability at least $1-\delta$ estimate the location up to error $O(\varepsilon \sqrt{\log(1/\varepsilon)})$ using $\tfrac{d\log(d) + \log(1/\delta)}{\varepsilon^2 \log(1/\varepsilon)}$ samples. This result matches the best-known guarantees for the Gaussian distribution and known SQ lower bounds (up to the $\log(d)$ factor). For elliptical distributions with unknown scatter (covariance) matrix, we propose a sequence of efficient algorithms that approaches this optimal error. Specifically, for every $k \in \mathbb{N}$, we design an estimator using $\tilde{O}(d^k)$ time and samples that achieves error $O(\varepsilon^{1-\frac{1}{2k}})$. This matches the error and running-time guarantees known under the assumption of certifiably bounded moments of order up to $k$. For unknown covariance, error bounds of $o(\sqrt{\varepsilon})$ are not even known for (general) sub-Gaussian distributions. Our algorithms are based on a generalization of the well-known filtering technique. We show how this machinery can be combined with Huber-loss-based techniques to work with projections of the noise that behave more nicely than the initial noise. Moreover, we show how SoS proofs can be used to obtain algorithmic guarantees even for distributions without a first moment. We believe that this approach may find other applications in future work.
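To illustrate why Huber-loss-based estimation remains meaningful when no moments exist, the following minimal Python sketch estimates the location of an $\varepsilon$-corrupted product Cauchy sample by minimizing the Huber loss in each coordinate separately. This is not the algorithm of this work: it omits the filtering step entirely, and a purely coordinate-wise estimator can in general incur error that grows with the dimension. The function names, the threshold `delta`, the outlier value, and the corruption model (replacing the first $\varepsilon n$ points with a fixed vector) are all assumptions made for illustration.

```python
import numpy as np


def huber_location(x, delta=1.0, iters=100, tol=1e-9):
    """Approximate the minimizer of the 1-D Huber loss sum_i H_delta(x_i - mu).

    Uses iteratively reweighted averaging, started at the median.
    """
    mu = float(np.median(x))
    for _ in range(iters):
        r = np.abs(x - mu)
        # Weight 1 for residuals within delta, delta/|r| beyond it.
        w = delta / np.maximum(r, delta)
        new_mu = float(np.sum(w * x) / np.sum(w))
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def coordinatewise_huber(x, delta=1.0):
    """Apply the 1-D Huber location estimate to every column of x."""
    return np.array([huber_location(x[:, j], delta) for j in range(x.shape[1])])


def corrupted_product_cauchy(n, mu, eps, rng, outlier=50.0):
    """Draw n points mu + (i.i.d. standard Cauchy noise), then overwrite
    an eps-fraction with a fixed outlier (a simple, illustrative adversary)."""
    x = mu + rng.standard_cauchy(size=(n, mu.shape[0]))
    x[: int(eps * n)] = outlier
    return x


def demo(seed=0, n=20000, eps=0.05):
    """Return the Euclidean error of the coordinate-wise Huber estimate."""
    rng = np.random.default_rng(seed)
    mu = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
    x = corrupted_product_cauchy(n, mu, eps, rng)
    return float(np.linalg.norm(coordinatewise_huber(x) - mu))


if __name__ == "__main__":
    print("Huber estimate error:", demo())
```

The Huber loss is quadratic near zero and linear in the tails, so each sample influences the estimate by at most `delta`. This bounded influence is what keeps the estimate stable under heavy-tailed Cauchy noise, where the sample mean does not concentrate at all.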
