This study investigates a statistical property of Lagrange multipliers in constrained Maximum Likelihood Estimation (MLE) and Least Squares (LS) problems from the perspective of numerical optimization. Building on large-sample theory, we show that the associated Lagrange multipliers converge to zero as the sample size increases, provided the distribution is correctly specified in MLE or the residuals are normally distributed in LS. Although this asymptotic behavior has long been recognized in statistics, it has received little explicit attention in numerical optimization and has rarely been exploited in algorithmic design. Importantly, the insight extends beyond classical low-dimensional settings: even in modern high-dimensional applications such as deep learning, where the number of parameters may exceed the sample size, the same reasoning applies provided the generalization performance is good. This observation has two main implications. First, many constrained optimization algorithms, including the Augmented Lagrangian Method, Sequential Quadratic Programming, and interior-point methods, require initial values for the multipliers, and choosing zero is statistically justified. Numerical experiments on constrained regression and dynamic discrete choice model estimation support this implication, showing that initializing the multipliers at zero usually leads to stable and efficient performance. Second, penalty-based approaches that convert constrained problems into unconstrained ones can perform well when the true multipliers are small, which helps explain why such methods often succeed in practice.
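The shrinking-multiplier claim can be illustrated with a small simulation. The following is a minimal sketch, not the paper's experiment: it assumes an equality-constrained least-squares problem whose objective is the sample average of squared residuals, a constraint that the true coefficients satisfy, and normally distributed errors. The function names, the coefficient vector, and the constraint (coefficients sum to one) are all hypothetical choices made for illustration.

```python
import numpy as np

def constrained_ls_multiplier(n, rng):
    """Solve min (1/(2n))||y - X b||^2 s.t. A b = c via the KKT linear system.

    Illustrative setup (an assumption, not taken from the paper): the true
    coefficients satisfy the constraint and the errors are standard normal.
    """
    beta_true = np.array([0.5, 0.3, 0.2])   # sums to 1, so the constraint holds
    A = np.ones((1, 3))                      # constraint: coefficients sum to c
    c = np.array([1.0])

    X = rng.normal(size=(n, 3))
    y = X @ beta_true + rng.normal(size=n)

    H = X.T @ X / n                          # Hessian of the averaged objective
    g = X.T @ y / n

    # KKT conditions: H b + A' lam = g and A b = c
    K = np.block([[H, A.T], [A, np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.concatenate([g, c]))
    return sol[:3], sol[3]                   # (estimate, Lagrange multiplier)

def mean_abs_multiplier(n, reps=50, seed=0):
    """Average |multiplier| over several simulated data sets of size n."""
    rng = np.random.default_rng(seed)
    values = [abs(constrained_ls_multiplier(n, rng)[1]) for _ in range(reps)]
    return float(np.mean(values))

sizes = [100, 10_000, 100_000]
results = {n: mean_abs_multiplier(n) for n in sizes}
for n, value in results.items():
    print(f"n = {n:>7}: mean |multiplier| = {value:.5f}")
```

Under these assumptions the average multiplier magnitude should fall as `n` grows, at roughly the `1/sqrt(n)` rate suggested by large-sample theory. The scaling matters: if the objective were the unnormalized sum of squared residuals, the multiplier would be `n` times larger and would not shrink.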