We present a novel quantum high-dimensional linear regression algorithm with an $\ell_1$-penalty based on the classical LARS (Least Angle Regression) pathwise algorithm. Like existing classical numerical algorithms for Lasso, our quantum algorithm provides the full regularisation path as the penalty term varies, but quadratically faster per iteration under specific conditions. A quadratic speedup in the number of features/predictors $d$ is possible by using the simple quantum minimum-finding subroutine from D\"urr and Hoyer (arXiv'96) to obtain the joining time at each iteration. We then improve upon this simple quantum algorithm and obtain a quadratic speedup in both the number of features $d$ and the number of observations $n$ by using the recent approximate quantum minimum-finding subroutine from Chen and de Wolf (ICALP'23). As one of our main contributions, we construct a quantum unitary based on quantum amplitude estimation to approximately compute the joining times to be searched over by the approximate quantum minimum finding. Since the joining times are no longer computed exactly, it is not a priori clear that the resulting approximate quantum algorithm obtains a good solution. As our second main contribution, we prove, via an approximate version of the KKT conditions and a duality gap, that the LARS algorithm (and therefore our quantum algorithm) is robust to errors: it still outputs a path that minimises the Lasso cost function up to a small error even if the joining times are only approximately computed. Finally, in the model where the observations are generated by an underlying linear model with an unknown coefficient vector, we prove bounds on the difference between the unknown coefficient vector and the approximate Lasso solution, which generalise known results about convergence rates in classical statistical learning theory analysis.
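To make the role of the joining time concrete, the following is a minimal classical sketch of a single LARS step. It assumes the standard textbook formulation of Least Angle Regression (equiangular direction plus a smallest-positive step length), which may differ in notation from the formulation used in this work. The function name `lars_step` and all variable names are hypothetical and chosen for illustration only. The inner loop over inactive features is the $O(d)$ classical search that a minimum-finding subroutine would replace.

```python
import numpy as np

def lars_step(X, y, mu, active, tol=1e-12):
    """One classical LARS step: return (step length, joining index, direction)."""
    c = X.T @ (y - mu)                  # correlations with the current residual
    C = np.max(np.abs(c))               # largest absolute correlation (shared by the active set)
    s = np.sign(c[active])
    XA = X[:, active] * s               # sign-adjusted active columns
    g = np.linalg.solve(XA.T @ XA, np.ones(len(active)))
    A = 1.0 / np.sqrt(g.sum())
    u = XA @ (A * g)                    # equiangular direction for the active set
    a = X.T @ u
    gamma, joining = np.inf, None
    for j in range(X.shape[1]):         # classical O(d) search over inactive features
        if j in active:
            continue
        # Two candidate crossing times per feature; keep the smallest positive one.
        for num, den in ((C - c[j], A - a[j]), (C + c[j], A + a[j])):
            if den > tol and tol < num / den < gamma:
                gamma, joining = num / den, j
    return gamma, joining, u

# Illustrative usage on synthetic data (assumed, not from the source).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
X /= np.linalg.norm(X, axis=0)          # unit-norm columns
y = rng.standard_normal(50)
active = [int(np.argmax(np.abs(X.T @ y)))]
gamma, joining, u = lars_step(X, y, np.zeros(50), active)
```

After moving a distance `gamma` along `u`, the joining feature's absolute correlation with the residual matches that of the active set, which is the event the joining time marks. In the approximate setting described above, each candidate `num / den` would only be known up to an error, which is why a robustness analysis is needed.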