We study oblivious sketching for $k$-sparse linear regression under various loss functions, including the $\ell_p$ norms and a broad class of hinge-like loss functions that contains the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This extends to $\ell_p$ loss with an additional additive $O(k\log(k/\varepsilon)/\varepsilon^2)$ term in the upper bound. These bounds establish a surprising separation from the related sparse recovery problem, which is an important special case of sparse regression. For sparse recovery under the $\ell_2$ norm, we observe an upper bound of $O(k \log (d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression. For sparse regression under hinge-like loss functions, including sparse logistic and sparse ReLU regression, we give the first known sketching bounds that achieve $o(d)$ rows, showing that $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ rows suffice, where $\mu$ is a natural complexity parameter needed to obtain relative error bounds for these loss functions. We again show that this sketching dimension is tight, up to lower order terms and the dependence on $\mu$. Finally, we show that similar sketching bounds can be achieved for LASSO regression, a popular convex relaxation of sparse regression, where one aims to minimize $\|Ax-b\|_2^2+\lambda\|x\|_1$ over $x\in\mathbb{R}^d$. We show that sketching dimension $O(\log(d)/(\lambda \varepsilon)^2)$ suffices and that the dependence on $d$ and $\lambda$ is tight.
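To make the sparse $\ell_2$ setting concrete, the following is a minimal sketch of a sketch-and-solve pipeline. It rests on several assumptions that the abstract does not state: the sketch is a dense Gaussian matrix (the abstract does not specify the construction), the constant `c` in the row count is a hypothetical placeholder, and the helper names `sketch_rows` and `best_k_sparse` are illustrative only. The solver enumerates all supports by brute force, which is feasible only for tiny $d$ and $k$.

```python
import itertools
import numpy as np


def sketch_rows(k, d, eps, c=1.0):
    # Row count of the form c * k * log(d/k) / eps^2; the constant c is a
    # hypothetical placeholder, not a value taken from the source.
    return int(np.ceil(c * k * np.log(d / k) / eps**2))


def best_k_sparse(A, b, k):
    # Brute-force k-sparse least squares: try every support of size k.
    # Exponential in k, so only meant for small illustrative instances.
    d = A.shape[1]
    best_cost, best_x = np.inf, None
    for support in itertools.combinations(range(d), k):
        cols = list(support)
        coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        cost = np.linalg.norm(A[:, cols] @ coef - b)
        if cost < best_cost:
            x = np.zeros(d)
            x[cols] = coef
            best_cost, best_x = cost, x
    return best_x, best_cost


rng = np.random.default_rng(0)
n, d, k, eps = 2000, 12, 2, 0.5

# Synthetic instance with a planted 2-sparse signal plus small noise.
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[[3, 7]] = [1.5, -2.0]
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Oblivious sketch: S is drawn without looking at A or b.
# A Gaussian sketch is assumed here purely for illustration.
m = sketch_rows(k, d, eps, c=4.0)
S = rng.standard_normal((m, n)) / np.sqrt(m)

# Solve the small sketched problem, then evaluate its solution on the
# original problem and compare against the true k-sparse optimum.
x_sk, _ = best_k_sparse(S @ A, S @ b, k)
_, opt = best_k_sparse(A, b, k)
ratio = np.linalg.norm(A @ x_sk - b) / opt

print(f"sketch rows m = {m}, original rows n = {n}")
print(f"cost ratio (sketched solution / optimum) = {ratio:.4f}")
```

The quantity `ratio` is the relative error that a $(1+\varepsilon)$ guarantee would bound. The sketched problem has `m` rows instead of `n`, and `m` grows with $\log(d/k)$, not with $n$ or linearly with $d$.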