Sparse recovery is among the most well-studied problems in learning theory and high-dimensional statistics. In this work, we investigate the statistical and computational landscapes of sparse recovery with $\ell_\infty$ error guarantees. This variant of the problem is motivated by \emph{variable selection} tasks, where the goal is to estimate the support of a $k$-sparse signal in $\mathbb{R}^d$. Our main contribution is a provable separation between the \emph{oblivious} (``for each'') and \emph{adaptive} (``for all'') models of $\ell_\infty$ sparse recovery. We show that under an oblivious model, the optimal $\ell_\infty$ error is attainable in near-linear time with $\approx k\log d$ samples, whereas in an adaptive model, $\gtrsim k^2$ samples are necessary for any algorithm to achieve this bound. This establishes a surprising contrast with the standard $\ell_2$ setting, where $\approx k \log d$ samples suffice even for adaptive sparse recovery. We conclude with a preliminary examination of a \emph{partially-adaptive} model, where we show nontrivial variable selection guarantees are possible with $\approx k\log d$ measurements.