We describe a puzzle involving the interactions between an optimization of a multivariate quadratic function and a "plug-in" estimator of a spiked covariance matrix. When the largest eigenvalues (i.e., the spikes) diverge with the dimension, the gap between the true and the out-of-sample optima typically also diverges. We show how to "fine-tune" the plug-in estimator in a precise way to avoid this outcome. Central to our description is a "quadratic optimization bias" function, the roots of which determine this fine-tuning property. We derive an estimator of this root from a finite number of observations of a high dimensional vector. This leads to a new covariance estimator designed specifically for applications involving quadratic optimization. Our theoretical results have further implications for improving low dimensional representations of data, and principal component analysis in particular.
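The phenomenon can be illustrated numerically. The sketch below (our own toy construction, not the paper's estimator: the names `Sigma`, `w_plug`, the spike strength, and the ridge term are illustrative assumptions) optimizes a concave quadratic using a plug-in sample covariance of a spiked population covariance, then evaluates both solutions under the true covariance; the plug-in solution falls short of the true optimum.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 200, 100  # high dimension, fewer observations

# Spiked population covariance: one eigenvalue that grows with the dimension.
spike = 0.3 * p
beta = rng.standard_normal(p)
beta /= np.linalg.norm(beta)
Sigma = spike * np.outer(beta, beta) + np.eye(p)

# Quadratic objective f(w) = mu'w - (1/2) w' Sigma w, maximized at w* = Sigma^{-1} mu.
mu = rng.standard_normal(p)
w_true = np.linalg.solve(Sigma, mu)

# Plug-in: replace Sigma by a sample covariance from n draws
# (a small ridge keeps it invertible when n < p).
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = X.T @ X / n + 1e-2 * np.eye(p)
w_plug = np.linalg.solve(S, mu)

def objective(w):
    # Evaluated under the TRUE covariance, i.e., out of sample.
    return mu @ w - 0.5 * w @ Sigma @ w

gap = objective(w_true) - objective(w_plug)
print(gap)  # positive: the plug-in optimizer underperforms the true optimum
```

Because the objective is concave with a unique maximizer, the gap is nonnegative by construction; the point of the paper is that, with diverging spikes, this gap typically diverges unless the plug-in estimator is fine-tuned.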