Firth (1993, Biometrika) shows that, when the number of parameters is fixed, the maximum Jeffreys' prior penalized likelihood estimator in logistic regression has asymptotic bias that decreases with the square of the number of observations, which is an order faster than the typical rate for maximum likelihood. The widespread use of that estimator in applied work is supported by the results in Kosmidis and Firth (2021, Biometrika), who show that it always takes finite values, even in cases where the maximum likelihood estimate does not exist. Kosmidis and Firth (2021, Biometrika) also provide empirical evidence that the estimator has good bias properties in high-dimensional settings where the number of parameters grows asymptotically linearly with, but more slowly than, the number of observations. We design and carry out a large-scale computer experiment covering a wide range of such high-dimensional settings, and produce strong empirical evidence for a simple rescaling of the maximum Jeffreys' prior penalized likelihood estimator that delivers high accuracy in signal recovery in the presence of an intercept parameter. The rescaled estimator remains effective even in cases where estimates from maximum likelihood, and from other recently proposed corrective methods based on approximate message passing, do not exist.
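To make the finiteness property concrete, the following is a minimal sketch of a maximum Jeffreys' prior penalized likelihood fit for logistic regression, applied to a small, perfectly separated data set for which the maximum likelihood estimate does not exist. It assumes the standard form of the penalty, 0.5 * log det(X'WX), and a plain Fisher-scoring iteration on the modified score; the function names `firth_logistic` and `modified_score` and the toy data are illustrative assumptions, not code from the paper. The rescaling studied in the paper is not shown, because the abstract does not specify it.

```python
import numpy as np


def modified_score(X, y, beta):
    """Score of l(beta) + 0.5 * log det(X' W X) and the Fisher information."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
    w = p * (1.0 - p)                        # logistic working weights
    info = X.T @ (w[:, None] * X)            # Fisher information X' W X
    # Leverages h_i = w_i * x_i' (X' W X)^{-1} x_i
    h = w * np.einsum("ij,ij->i", X @ np.linalg.inv(info), X)
    score = X.T @ (y - p + h * (0.5 - p))    # Jeffreys-prior adjusted score
    return score, info


def firth_logistic(X, y, max_iter=200, tol=1e-10):
    """Fisher scoring on the adjusted score, started from beta = 0."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        score, info = modified_score(X, y, beta)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Toy data (assumed for illustration): the sign of x separates y perfectly,
# so the maximum likelihood slope diverges to infinity.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])    # intercept column plus covariate

beta_hat = firth_logistic(X, y)
print("penalized estimate (intercept, slope):", beta_hat)
```

The sketch omits safeguards such as step halving or a cap on the step size, which a production fit on larger or ill-conditioned designs would likely need.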