Given a matrix $A\in \mathbb{R}^{n\times d}$ and a vector $b\in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees: finding a vector $x'\in \mathbb{R}^d$ such that $\|x'-x^*\|_\infty \leq \frac{\epsilon}{\sqrt{d}}\cdot \|Ax^*-b\|_2\cdot \|A^\dagger\|$, where $x^*=\arg\min_{x\in \mathbb{R}^d}\|Ax-b\|_2$. A popular approach to solving such $\ell_2$ regression problems is sketching: pick a structured random matrix $S\in \mathbb{R}^{m\times n}$ with $m\ll n$ for which $SA$ can be computed quickly, then solve the ``sketched'' regression problem $\arg\min_{x\in \mathbb{R}^d} \|SAx-Sb\|_2$. In this paper, we show that in order to obtain such an $\ell_\infty$ guarantee for $\ell_2$ regression, one has to use sketching matrices that are dense. To the best of our knowledge, this is the first use case in which dense sketching matrices are necessary. On the algorithmic side, we prove that there exists a distribution of dense sketching matrices with $m=\epsilon^{-2}d\log^3(n/\delta)$ rows such that solving the sketched regression problem gives the $\ell_\infty$ guarantee with probability at least $1-\delta$. Moreover, the matrix $SA$ can be computed in time $O(nd\log n)$. Our row count is optimal up to logarithmic factors, and it significantly improves on the result of [Price, Song and Woodruff, ICALP'17], which requires a number of rows super-linear in $d$, namely $m=\Omega(\epsilon^{-2}d^{1+\gamma})$ for $\gamma=\Theta(\sqrt{\frac{\log\log n}{\log d}})$. We also develop a novel analytical framework for regression with $\ell_\infty$ guarantees that utilizes the Oblivious Coordinate-wise Embedding (OCE) property introduced in [Song and Yu, ICML'21]. Our analysis is arguably much simpler and more general than that of [Price, Song and Woodruff, ICALP'17], and it extends to dense sketches for tensor products of vectors.
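To make the sketch-and-solve recipe above concrete, the following is a minimal illustrative sketch in Python with NumPy. It is not the authors' implementation. As a stand-in for a dense sketching distribution, it assumes an SRHT-style matrix $S=\frac{1}{\sqrt{m}}PHD$, where $D$ is a random sign diagonal, $H$ is an unnormalized Walsh-Hadamard matrix, and $P$ samples $m$ rows; the abstract itself does not name the distribution. The function names `fwht`, `srht_sketch`, and `sketched_least_squares`, the problem sizes, and the requirement that $n$ be a power of two are all assumptions made for illustration. Applying $H$ through the fast transform costs $O(nd\log n)$ for an $n\times d$ input, which matches the running time quoted above.

```python
import numpy as np


def fwht(X):
    """Fast Walsh-Hadamard transform along axis 0 of a 2-D array.

    Uses the unnormalized Sylvester ordering; the row count must be a power of two.
    """
    n, d = X.shape
    assert n & (n - 1) == 0, "row count must be a power of two"
    X = X.astype(float)  # work on a float copy
    h = 1
    while h < n:
        # Split each block of 2h rows into two halves and apply a butterfly step.
        blocks = X.reshape(n // (2 * h), 2, h, d)
        top, bottom = blocks[:, 0], blocks[:, 1]
        X = np.stack([top + bottom, top - bottom], axis=1).reshape(n, d)
        h *= 2
    return X


def srht_sketch(M, m, rng):
    """Return S @ M for an assumed SRHT-style sketch S = (1/sqrt(m)) * P H D."""
    n = M.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)      # D: random sign flips
    HM = fwht(signs[:, None] * M)                # H D M via the fast transform
    rows = rng.choice(n, size=m, replace=False)  # P: sample m distinct rows
    return HM[rows] / np.sqrt(m)


def sketched_least_squares(A, b, m, rng):
    """Solve argmin_x ||S A x - S b||_2 using one shared sketch for A and b."""
    SAb = srht_sketch(np.column_stack([A, b]), m, rng)
    SA, Sb = SAb[:, :-1], SAb[:, -1]
    x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x_sketch


# Hypothetical problem sizes chosen only for illustration.
rng = np.random.default_rng(0)
n, d, m = 4096, 8, 1024
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, m, rng)

# Compare the l_infinity error with the scale ||A x* - b||_2 * ||A^+|| / sqrt(d).
linf_error = np.max(np.abs(x_sketch - x_star))
scale = np.linalg.norm(A @ x_star - b) * np.linalg.norm(np.linalg.pinv(A), 2) / np.sqrt(d)
print("l_inf error:", linf_error)
print("empirical epsilon (error / scale):", linf_error / scale)
```

The ratio printed on the last line plays the role of $\epsilon$ in the guarantee; increasing `m` should typically shrink it. This toy run only illustrates the quantities involved and says nothing about the lower bound for sparse sketches.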