Model averaging (MA), a technique for combining estimators from a set of candidate models, has attracted increasing attention in machine learning and statistics. The existing literature implicitly treats MA as a form of shrinkage estimation that draws the response vector towards the subspaces spanned by the candidate models. This paper explores this perspective by establishing connections between MA and shrinkage in a linear regression setting with multiple nested models. We first demonstrate that the optimal MA estimator is the best linear estimator with monotone non-increasing weights in a Gaussian sequence model. The Mallows MA, which estimates the weights by minimizing Mallows' $C_p$, is a variation of the positive-part Stein estimator. Motivated by these connections, we develop a novel MA procedure based on blockwise Stein estimation. The resulting Stein-type MA estimator is asymptotically optimal across a broad parameter space when the variance is known. Numerical results support our theoretical findings. The connections established in this paper may open up new avenues for investigating MA from different perspectives. The paper concludes with a discussion of topics for future research.
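To make the shrinkage idea concrete, the following is a minimal sketch of a generic blockwise positive-part Stein rule in a Gaussian sequence model with known variance. It is not the paper's Stein-type MA procedure: the function name `blockwise_stein`, the user-supplied block partition `block_sizes`, and the use of the block size (rather than the block size minus two) in the shrinkage factor are all assumptions made for illustration.

```python
import numpy as np


def blockwise_stein(y, block_sizes, sigma2):
    """Shrink each block of y towards zero with a positive-part Stein weight.

    Hypothetical helper for illustration only. Assumes observations
    y_j = theta_j + noise with known noise variance sigma2, and consecutive
    blocks whose lengths are given by block_sizes.
    """
    y = np.asarray(y, dtype=float)
    if sum(block_sizes) != y.size:
        raise ValueError("block_sizes must sum to len(y)")

    est = np.empty_like(y)
    start = 0
    for size in block_sizes:
        block = y[start:start + size]
        energy = float(block @ block)  # squared norm of the block
        if energy > 0.0:
            # Positive-part weight: (1 - sigma2 * |B| / ||y_B||^2)_+
            weight = max(0.0, 1.0 - sigma2 * size / energy)
        else:
            weight = 0.0  # an all-zero block stays at zero
        est[start:start + size] = weight * block
        start += size
    return est
```

For instance, `blockwise_stein([3.0, 4.0, 0.1, 0.1], [2, 2], 1.0)` multiplies the first block by 1 - 2/25 = 0.92 and sets the second, low-energy block to zero. Blocks with a strong signal are left nearly untouched, while blocks dominated by noise are discarded, which is the sense in which such a rule acts as data-driven shrinkage.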