Model averaging (MA), a technique for combining estimators from a set of candidate models, has attracted increasing attention in machine learning and statistics. The existing literature implicitly treats MA as a form of shrinkage estimation that draws the response vector towards the subspaces spanned by the candidate models. This paper explores this perspective by establishing connections between MA and shrinkage in a linear regression setting with multiple nested models. We first demonstrate that the optimal MA estimator is the best linear estimator with monotonically non-increasing weights in a Gaussian sequence model. The Mallows MA (MMA), which estimates weights by minimizing Mallows' $C_p$ over the unit simplex, can be viewed as a variation of the sum of a set of positive-part Stein estimators. Indeed, the latter estimator differs from the MMA only in that it optimizes Mallows' $C_p$ over a suitably relaxed weight set. Motivated by these connections, we develop a novel MA procedure based on blockwise Stein estimation. The resulting Stein-type MA estimator is asymptotically optimal across a broad parameter space when the variance is known. Numerical results support our theoretical findings. The connections established in this paper may open up new avenues for investigating MA from different perspectives. The paper concludes with a discussion of topics for future research.
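The link between nested MA and blockwise shrinkage described above can be illustrated with a small sketch. It assumes a Gaussian sequence model in which the nested candidate models split the coordinates into consecutive blocks, and it uses the textbook positive-part Stein factor $(1 - \sigma^2 |B| / \lVert y_B \rVert^2)_+$ for each block $B$, which is the minimizer over $[0, 1]$ of the blockwise $C_p$-type criterion $(1-\gamma)^2 \lVert y_B \rVert^2 + 2\sigma^2 \gamma |B|$. The function names, the `block_ends` argument, and the toy numbers are hypothetical and chosen only for illustration; this is not necessarily the exact procedure developed in the paper.

```python
def cumulative_weights(weights):
    """Turn simplex weights w_1..w_M on nested models into per-block factors.

    Block m is retained by every candidate model l >= m, so its factor is
    gamma_m = w_m + ... + w_M, which is non-increasing in m.
    """
    factors, running = [], 0.0
    for w in reversed(weights):
        running += w
        factors.append(running)
    return factors[::-1]


def positive_part_stein_factors(y, block_ends, sigma2):
    """Positive-part Stein factor for each consecutive block of y.

    block_ends holds the exclusive end index of each block (assumed layout).
    Unlike cumulative_weights, these factors are not forced to be monotone.
    """
    factors, start = [], 0
    for end in block_ends:
        energy = sum(v * v for v in y[start:end])
        size = end - start
        if energy > 0:
            factors.append(max(0.0, 1.0 - sigma2 * size / energy))
        else:
            factors.append(0.0)
        start = end
    return factors


def apply_factors(y, block_ends, factors):
    """Scale each block of y by its factor; coordinates past the last block are zeroed."""
    out, start = [0.0] * len(y), 0
    for end, g in zip(block_ends, factors):
        for j in range(start, end):
            out[j] = g * y[j]
        start = end
    return out


# Toy usage with two blocks of size two and known variance sigma2 = 1.
y = [3.0, 4.0, 1.0, 1.0]
stein = positive_part_stein_factors(y, [2, 4], 1.0)
estimate = apply_factors(y, [2, 4], stein)
```

In this toy case the first block carries a strong signal and is shrunk only slightly, while the second block is indistinguishable from noise and is set to zero. The simplex-constrained weights of `cumulative_weights` always yield monotone factors, whereas the blockwise Stein factors are chosen block by block without that constraint.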