In empirical research, when multiple estimators are available for the same parameter of interest, a central question arises: how do we combine unbiased but less precise estimators with biased but more precise ones to improve inference? In this setting, point estimation has attracted considerable attention. In this paper, we focus on a less studied question: how can we conduct valid statistical inference when the bias is unknown? We propose a strategy for combining unbiased and biased estimators from a sensitivity analysis perspective. We derive a sequence of confidence intervals indexed by the magnitude of the bias, enabling researchers to assess how conclusions vary with the bias level. Importantly, we introduce the notion of the b-value, the critical value of the unknown maximum relative bias at which combining estimators no longer yields a significant result. We apply this strategy to three canonical combined estimators: the precision-weighted estimator, the pretest estimator, and the soft-thresholding estimator. For each estimator, we characterize the sequence of confidence intervals and determine the bias threshold at which the conclusion changes. Based on the theory, we recommend reporting the b-value based on the soft-thresholding estimator and its associated confidence intervals, which are robust to unknown bias and achieve the lowest worst-case risk among the alternatives.
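To make the setup concrete, the following is a minimal sketch of the three combined estimators and of a bias-indexed confidence interval. It is an illustration under stated assumptions, not the paper's exact construction. It assumes two independent, approximately normal estimators with known standard errors, and a bound `max_bias` on the absolute bias of the biased estimator; the abstract defines the b-value in terms of relative bias, so `bias_threshold` here is only an analogue. All function names, the threshold `lam`, and the conservative interval (bias bound plus normal margin) are hypothetical choices made for this example.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% normal critical value


def precision_weighted(theta_u, se_u, theta_b, se_b):
    """Inverse-variance combination (assumes independent estimators).

    Returns (estimate, standard error, weight on the biased estimator).
    """
    prec_u, prec_b = 1.0 / se_u**2, 1.0 / se_b**2
    w_b = prec_b / (prec_u + prec_b)
    estimate = (1.0 - w_b) * theta_u + w_b * theta_b
    se = math.sqrt(1.0 / (prec_u + prec_b))
    return estimate, se, w_b


def pretest(theta_u, se_u, theta_b, se_b, crit=Z_95):
    """Pool only if the two estimators do not differ significantly."""
    t_stat = abs(theta_b - theta_u) / math.sqrt(se_u**2 + se_b**2)
    if t_stat <= crit:
        return precision_weighted(theta_u, se_u, theta_b, se_b)[0]
    return theta_u


def soft_threshold(theta_u, se_u, theta_b, se_b, lam):
    """Move from theta_u toward the pooled estimate, with the
    discrepancy between the estimators clipped at +/- lam."""
    w_b = precision_weighted(theta_u, se_u, theta_b, se_b)[2]
    delta = theta_b - theta_u
    clipped = max(-lam, min(lam, delta))
    return theta_u + w_b * clipped


def bias_aware_ci(theta_u, se_u, theta_b, se_b, max_bias, z=Z_95):
    """Conservative interval for the precision-weighted estimator when
    the biased estimator's absolute bias is at most max_bias."""
    est, se, w_b = precision_weighted(theta_u, se_u, theta_b, se_b)
    half_width = w_b * max_bias + z * se
    return est - half_width, est + half_width


def bias_threshold(theta_u, se_u, theta_b, se_b, z=Z_95):
    """Smallest absolute bias bound at which the interval above reaches
    zero (0.0 if the result is not significant even without bias)."""
    est, se, w_b = precision_weighted(theta_u, se_u, theta_b, se_b)
    return max(0.0, (abs(est) - z * se) / w_b)


if __name__ == "__main__":
    args = (2.0, 1.0, 1.5, 0.5)  # theta_u, se_u, theta_b, se_b (made-up values)
    print("precision-weighted:", precision_weighted(*args)[0])
    print("pretest:", pretest(*args))
    print("soft-thresholding:", soft_threshold(*args, lam=0.3))
    for bound in (0.0, 0.5, 1.0):
        print("max_bias =", bound, "CI =", bias_aware_ci(*args, max_bias=bound))
    print("bias threshold:", bias_threshold(*args))
```

Sweeping `max_bias` over a grid reproduces the sensitivity-analysis idea in miniature: the interval widens as the assumed bias bound grows, and the threshold marks the bound at which the interval first touches zero.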