This study develops a higher-order asymptotic framework for test-time adaptation (TTA) of Batch Normalization (BN) statistics under distribution shift by combining classical Edgeworth expansion and saddlepoint approximation techniques with a novel one-step M-estimation perspective. By analyzing the statistical discrepancy between the training and test distributions, we derive an Edgeworth expansion for the normalized difference in BN means and obtain an optimal weighting parameter that minimizes the mean-squared error of the adapted statistic. Reinterpreting BN TTA as a one-step M-estimator lets us derive higher-order local asymptotic normality results that capture the influence of skewness and other higher-order moments on the estimator's behavior. We further quantify the trade-offs among bias, variance, and skewness in the adaptation process and establish a corresponding generalization bound on the model risk. Refined saddlepoint approximations then deliver uniformly accurate density and tail-probability estimates for the BN TTA statistic. Together, these theoretical results clarify how higher-order corrections and robust one-step updating can improve the reliability and performance of BN layers in adapting to changing data distributions.
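To make the weighting step concrete, the following is a minimal first-order sketch under assumed notation (the symbols $\mu_{\mathrm{tr}}$, $\hat{\mu}_{\mathrm{te}}$, $\delta$, $\sigma^2$, $n$, and $\lambda$ are illustrative and not taken from the abstract above). Suppose the adapted BN mean is a convex combination of the training statistic and an unbiased test-batch estimate with variance $\sigma^2/n$, and let $\delta$ denote the shift between the training and test population means:
\[
\hat{\mu}(\lambda) = \lambda\,\hat{\mu}_{\mathrm{te}} + (1-\lambda)\,\mu_{\mathrm{tr}},
\qquad
\mathrm{MSE}(\lambda) = \lambda^{2}\,\frac{\sigma^{2}}{n} + (1-\lambda)^{2}\,\delta^{2},
\]
which is minimized at
\[
\lambda^{\ast} = \frac{\delta^{2}}{\delta^{2} + \sigma^{2}/n}.
\]
This first-order bias–variance trade-off already indicates why the optimal weight shrinks toward the training statistic when the test batch is small (large $\sigma^{2}/n$) and toward the test estimate when the shift $\delta$ dominates; the higher-order corrections described above would further refine such a choice by accounting for skewness and other higher-order moments of the test-batch estimate.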