In this paper, we study trace regression, in which a parameter matrix B* is estimated via the convex relaxation of rank-regularized regression or via regularized non-convex optimization. These estimators are known to satisfy near-optimal error bounds under assumptions on the rank, coherence, and spikiness of B*. We start by introducing a general notion of spikiness for B* that provides a generic recipe for proving restricted strong convexity of the sampling operator of the trace regression, and we use it to obtain near-optimal, non-asymptotic bounds on the estimation error. As in the existing literature, these results require the regularization parameter to exceed a theory-inspired threshold that depends on the observation noise, which may be unknown in practice. Next, we extend the error bounds to cases where the regularization parameter is chosen via cross-validation. This result is significant because existing theoretical results on cross-validated estimators (Kale et al., 2011; Kumar et al., 2013; Abou-Moustafa and Szepesvari, 2017) do not apply to our setting: the estimators we study are not known to satisfy the notion of stability those results require. Finally, using simulations on synthetic and real data, we show that the cross-validated estimator selects a near-optimal penalty parameter and outperforms the theory-inspired approach to selecting it.
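To make the cross-validation step concrete, the following is a minimal sketch, not the paper's implementation. It assumes the matrix-completion special case of trace regression (each observation reveals one noisy entry of B*), uses a nuclear-norm penalty as the convex relaxation of the rank penalty, and solves the penalized problem with a plain proximal-gradient loop (a soft-impute-style iteration chosen here for brevity). The function names `svd_soft_threshold`, `fit_nuclear_norm`, and `cross_validate_lambda`, the fold count, and the grid of penalty values are all illustrative assumptions.

```python
import numpy as np


def svd_soft_threshold(M, tau):
    """Proximal operator of tau * (nuclear norm): shrink singular values by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def fit_nuclear_norm(Y, mask, lam, n_iter=200):
    """Proximal gradient for 0.5 * ||mask * (B - Y)||_F^2 + lam * ||B||_*.

    The smooth part has a 1-Lipschitz gradient, so a unit step size is valid.
    `mask` is a boolean array marking the observed entries of Y.
    """
    B = np.zeros_like(Y, dtype=float)
    for _ in range(n_iter):
        grad = mask * (B - Y)  # gradient of the squared loss on observed entries
        B = svd_soft_threshold(B - grad, lam)
    return B


def cross_validate_lambda(Y, mask, lambdas, n_folds=5, seed=0):
    """Pick the penalty with the smallest held-out squared error (K-fold CV).

    Assumption: folds are formed by randomly partitioning the observed entries.
    Returns the selected penalty and the per-penalty mean held-out error.
    """
    rng = np.random.default_rng(seed)
    obs = np.argwhere(mask)  # (row, col) index pairs of observed entries
    folds = np.array_split(rng.permutation(len(obs)), n_folds)
    cv_err = np.zeros(len(lambdas))
    for fold in folds:
        rows, cols = obs[fold, 0], obs[fold, 1]
        train_mask = mask.copy()
        train_mask[rows, cols] = False  # hide the held-out entries
        for j, lam in enumerate(lambdas):
            B_hat = fit_nuclear_norm(Y, train_mask, lam)
            cv_err[j] += np.sum((B_hat[rows, cols] - Y[rows, cols]) ** 2)
    cv_err /= len(obs)
    return lambdas[int(np.argmin(cv_err))], cv_err


if __name__ == "__main__":
    # Synthetic rank-2 example with additive Gaussian noise (illustrative only).
    rng = np.random.default_rng(1)
    B_star = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
    mask = rng.random((30, 30)) < 0.6
    Y = np.where(mask, B_star + 0.1 * rng.standard_normal((30, 30)), 0.0)
    best_lam, errs = cross_validate_lambda(Y, mask, [0.1, 1.0, 10.0])
    print("selected penalty:", best_lam)
    print("held-out errors:", errs)
```

In this sketch the penalty is selected purely from held-out prediction error, so no knowledge of the noise level is needed, in contrast with a threshold that depends on the observation noise.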