Optimal Sketching Bounds for Sparse Linear Regression

We study oblivious sketching for $k$-sparse linear regression under various loss functions, including $\ell_p$ norms and a broad class of hinge-like loss functions that contains the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This extends to $\ell_p$ loss with an additional additive $O(k\log(k/\varepsilon)/\varepsilon^2)$ term in the upper bound. This establishes a surprising separation from the related sparse recovery problem, which is an important special case of sparse regression. For sparse recovery under the $\ell_2$ norm, we observe an upper bound of $O(k \log (d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression. For sparse regression under hinge-like loss functions, including sparse logistic and sparse ReLU regression, we give the first known sketching bounds that achieve $o(d)$ rows, showing that $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ rows suffice, where $\mu$ is a natural complexity parameter needed to obtain relative error bounds for these loss functions. We again show that this dimension is tight, up to lower order terms and the dependence on $\mu$. Finally, we show that similar sketching bounds can be achieved for LASSO regression, a popular convex relaxation of sparse regression, where one aims to minimize $\|Ax-b\|_2^2+\lambda\|x\|_1$ over $x\in\mathbb{R}^d$. We show that sketching dimension $O(\log(d)/(\lambda \varepsilon)^2)$ suffices and that the dependence on $d$ and $\lambda$ is tight.
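To make the sketch-and-solve setting for sparse $\ell_2$ regression concrete, the following is a minimal illustrative sketch, not the construction analyzed here. It assumes a dense Gaussian matrix as the oblivious sketch, sets the hidden constant in $k\log(d/k)/\varepsilon^2$ to a hypothetical value `c = 1.0`, and solves the sketched problem by brute force over all supports of size $k$, which is only feasible for very small $d$ and $k$. The function names `sketch_rows`, `best_k_sparse`, and `sketch_and_solve` are made up for this example.

```python
import itertools
import numpy as np


def sketch_rows(k, d, eps, c=1.0):
    """Row count ceil(c * k * log(d/k) / eps^2); the constant c is an assumption."""
    return int(np.ceil(c * k * np.log(d / k) / eps**2))


def best_k_sparse(A, b, k):
    """Brute-force k-sparse least squares: try every support of size k."""
    d = A.shape[1]
    best_cost, best_x = np.inf, None
    for support in itertools.combinations(range(d), k):
        cols = list(support)
        # Ordinary least squares restricted to the chosen columns.
        coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        cost = np.linalg.norm(A[:, cols] @ coef - b)
        if cost < best_cost:
            x = np.zeros(d)
            x[cols] = coef
            best_cost, best_x = cost, x
    return best_x, best_cost


def sketch_and_solve(A, b, k, eps, seed=0):
    """Draw S independently of (A, b), then solve the smaller problem (SA, Sb)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = min(n, sketch_rows(k, d, eps))
    # Gaussian sketch, scaled so that E[||S y||^2] = ||y||^2 for any fixed y.
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    x_hat, _ = best_k_sparse(S @ A, S @ b, k)
    return x_hat, m


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d, k, eps = 200, 8, 2, 0.5
    A = rng.standard_normal((n, d))
    x_true = np.zeros(d)
    x_true[[1, 5]] = [2.0, -3.0]
    b = A @ x_true + 0.1 * rng.standard_normal(n)

    x_hat, m = sketch_and_solve(A, b, k, eps)
    _, opt_cost = best_k_sparse(A, b, k)
    print("sketch rows:", m, "of", n)
    print("cost ratio:", np.linalg.norm(A @ x_hat - b) / opt_cost)
```

The sketch is oblivious because `S` is drawn without looking at `A` or `b`. The printed cost ratio compares the sketched solution against the true $k$-sparse optimum on the original data; with the arbitrary constant chosen above, it should not be read as a verified $(1+\varepsilon)$ guarantee.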
