We present a novel quantum high-dimensional linear regression algorithm with an $\ell_1$-penalty, based on the classical LARS (Least Angle Regression) pathwise algorithm. As with existing classical algorithms for Lasso, our quantum algorithm provides the full regularisation path as the penalty term varies, but quadratically faster per iteration under specific conditions. A quadratic speedup in the number of features $d$ is possible by using the quantum minimum-finding routine of D\"urr and Hoyer (arXiv'96) to obtain the joining time at each iteration. We then improve upon this simple quantum algorithm and obtain a quadratic speedup in both the number of features $d$ and the number of observations $n$ by using the approximate quantum minimum-finding routine of Chen and de Wolf (ICALP'23). As one of our main contributions, we construct a quantum unitary that approximately computes the joining times to be searched over by approximate quantum minimum finding. Since the joining times are no longer computed exactly, it is no longer clear that the resulting approximate quantum algorithm obtains a good solution. As our second main contribution, we prove, via an approximate version of the KKT conditions and a duality gap, that the LARS algorithm (and thus our quantum algorithm) is robust to errors: it still outputs a path that minimises the Lasso cost function up to a small error even when the joining times are only approximately computed. Moreover, we show that, when the observations are sampled from a Gaussian distribution, our quantum algorithm's complexity depends only polylogarithmically on $n$, exponentially better than the classical LARS algorithm, while retaining the quadratic improvement in $d$. Finally, we propose a dequantised algorithm that also retains the polylogarithmic dependence on $n$, albeit with the linear scaling in $d$ of the standard LARS algorithm.
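To make the quantity being accelerated concrete, the following is a minimal classical sketch of one LARS step in the style of Efron et al.: it computes the candidate joining times $\gamma_j$ for the inactive features and takes their minimum. The function name `lars_joining_step` and the exact tie-breaking details are illustrative assumptions, not the paper's implementation; the final linear scan over $d$ features (the `min` over `gammas`) is the step that quantum minimum-finding would replace with an $\tilde{O}(\sqrt{d})$ search.

```python
import numpy as np

def lars_joining_step(X, y, beta, active):
    """One LARS step (illustrative sketch, not the paper's algorithm):
    find which inactive feature joins next and at what step size.
    Assumes columns of X are standardised, as in standard LARS."""
    r = y - X @ beta                       # current residual
    c = X.T @ r                            # correlations with the residual
    C = np.max(np.abs(c))                  # maximal absolute correlation
    s = np.sign(c[active])                 # signs on the active set
    XA = X[:, active] * s                  # sign-adjusted active columns
    G = XA.T @ XA                          # active Gram matrix
    w = np.linalg.solve(G, np.ones(len(active)))
    AA = 1.0 / np.sqrt(np.sum(w))          # equiangular normalisation A_A
    u = XA @ (AA * w)                      # unit equiangular direction
    a = X.T @ u                            # correlations with that direction
    gammas = {}
    for j in range(X.shape[1]):
        if j in active:
            continue
        # candidate joining times; keep only strictly positive ones
        cands = [(C - c[j]) / (AA - a[j]), (C + c[j]) / (AA + a[j])]
        cands = [g for g in cands if g > 1e-12]
        if cands:
            gammas[j] = min(cands)
    # classical O(d) scan -- the target of the quantum speedup
    j_star = min(gammas, key=gammas.get)
    return j_star, gammas[j_star]
```

A single call on standardised data returns the index of the next feature to join the active set together with its joining time; iterating such steps traces out the full regularisation path.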