We develop a technique to design efficiently computable estimators for sparse linear regression in the simultaneous presence of two adversaries: oblivious and adaptive. We design several robust algorithms that outperform the state of the art even in the special case when the oblivious adversary simply adds Gaussian noise. In particular, we provide a polynomial-time algorithm that with high probability recovers the signal up to error $O(\sqrt{\varepsilon})$ as long as the number of samples satisfies $n \ge \tilde{O}(k^2/\varepsilon)$, assuming only certain bounds on the third and the fourth moments of the design distribution ${D}$. In addition, prior to this work, even in the special case of Gaussian design and noise, no polynomial-time algorithm was known to achieve error $o(\sqrt{\varepsilon})$ in the sparse setting $n < d^2$. We show that under some assumptions on the fourth and the eighth moments of ${D}$, there is a polynomial-time algorithm that achieves error $o(\sqrt{\varepsilon})$ as long as $n \ge \tilde{O}(k^4 / \varepsilon^3)$. For the Gaussian distribution, this algorithm achieves error $O(\varepsilon^{3/4})$. Moreover, our algorithm achieves error $o(\sqrt{\varepsilon})$ for all log-concave distributions if $\varepsilon \le 1/\mathrm{polylog}(d)$. Our algorithms are based on filtering of the covariates using sum-of-squares relaxations, and on weighted Huber loss minimization with an $\ell_1$ regularizer. We provide a novel analysis of the weighted penalized Huber loss that is suitable for heavy-tailed designs in the presence of two adversaries. Furthermore, we complement our algorithmic results with Statistical Query lower bounds, providing evidence that our estimators are likely to have nearly optimal sample complexity.
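To make the second algorithmic ingredient concrete, below is a minimal sketch of $\ell_1$-regularized Huber loss minimization via proximal gradient descent (ISTA). This is only an illustration of the basic (unweighted, unfiltered) objective, not the paper's full algorithm: the choices of $\delta$, $\lambda$, the step size, and the synthetic data are ad hoc assumptions for the demo.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (coordinate-wise soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def huber_grad(r, delta):
    # Derivative of the Huber loss: identity on [-delta, delta], clipped outside,
    # so gross outliers contribute a bounded gradient.
    return np.clip(r, -delta, delta)

def l1_huber_regression(X, y, lam=0.1, delta=1.0, step=None, iters=2000):
    """ISTA for  min_beta (1/n) * sum_i huber_delta(y_i - x_i^T beta) + lam * ||beta||_1."""
    n, d = X.shape
    if step is None:
        # The smooth part is (||X||_2^2 / n)-Lipschitz-gradient, so this step size is safe.
        step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(d)
    for _ in range(iters):
        r = y - X @ beta
        grad = -X.T @ huber_grad(r, delta) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Demo: k-sparse signal, Gaussian design, a few gross corruptions in the noise.
rng = np.random.default_rng(0)
n, d, k = 400, 100, 5
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[:k] = 1.0
noise = 0.1 * rng.standard_normal(n)
noise[:20] += 50.0  # corruptions on 5% of the responses
y = X @ beta_star + noise
beta_hat = l1_huber_regression(X, y, lam=0.05, delta=1.0)
print(np.linalg.norm(beta_hat - beta_star))
```

Because the Huber gradient is clipped, the 5% of responses shifted by $50$ move the estimate only slightly, whereas ordinary least squares with an $\ell_1$ penalty would be badly biased by them; the soft-thresholding step is what enforces sparsity.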