This paper develops semiparametric methods for estimating and making inference on widely used inequality measures when survey data are subject to nonignorable nonresponse, a challenging setting in which response probabilities depend on the unobserved outcomes. Such nonresponse mechanisms are common in household surveys and invalidate standard inference procedures because of selection bias and a lack of population representativeness. We address this problem by exploiting callback data from repeated contact attempts and adopting a semiparametric model that leaves the outcome distribution unspecified. We construct semiparametric full-likelihood estimators for the underlying distribution and the associated inequality measures, and establish their large-sample properties for a broad class of functionals, including quantiles, the Theil index, and the Gini index. Explicit asymptotic variance expressions are derived, enabling valid Wald-type inference under nonignorable nonresponse. To facilitate implementation, we propose a stable and computationally convenient expectation-maximization algorithm, whose steps either admit closed-form expressions or reduce to fitting a standard logistic regression model. Simulation studies demonstrate that the proposed procedures effectively correct nonresponse bias and achieve near-benchmark efficiency. An application to Consumer Expenditure Survey data illustrates the practical gains from incorporating callback information when making inference on inequality measures.
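To make the target functionals concrete, the following is a minimal sketch of how quantiles, the Theil index, and the Gini index can be evaluated as plug-in functionals of a discrete distribution estimate. It assumes the estimated distribution is given as support points with probability masses; the function name `plug_in_measures` and its arguments are hypothetical, and the sketch does not reproduce the paper's likelihood estimator or EM algorithm, which would supply the masses.

```python
import math


def plug_in_measures(support, masses, tau=0.5):
    """Plug-in quantile, Theil index, and Gini index of a discrete distribution.

    support: strictly positive outcome values (assumption: the Theil index
             needs positive outcomes because it takes a logarithm).
    masses:  non-negative weights on those values; rescaled here to sum to one.
    tau:     quantile level in (0, 1].
    """
    pairs = sorted(zip(support, masses))
    xs = [x for x, _ in pairs]
    total = sum(p for _, p in pairs)
    ps = [p / total for _, p in pairs]

    mean = sum(p * x for x, p in zip(xs, ps))

    # Gini index: half the mean absolute difference, divided by the mean.
    mean_abs_diff = sum(
        pi * pj * abs(xi - xj)
        for xi, pi in zip(xs, ps)
        for xj, pj in zip(xs, ps)
    )
    gini = mean_abs_diff / (2.0 * mean)

    # Theil index: expectation of (x / mean) * log(x / mean).
    theil = sum(
        p * (x / mean) * math.log(x / mean)
        for x, p in zip(xs, ps)
        if p > 0
    )

    # tau-quantile: smallest support point whose cumulative mass reaches tau.
    quantile = xs[-1]
    cumulative = 0.0
    for x, p in zip(xs, ps):
        cumulative += p
        if cumulative >= tau - 1e-12:
            quantile = x
            break

    return {"quantile": quantile, "theil": theil, "gini": gini}
```

With equal masses on the observed respondents this reduces to the usual complete-case sample versions; under nonignorable nonresponse, masses that differ across respondents are what would move the three measures away from those naive values.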