Testing the homogeneity of two distributions is fundamental in statistics, but classical procedures may fail under nonignorable nonresponse. In many surveys, callback data record repeated contact attempts and provide auxiliary information about the response mechanism. We develop a semiparametric framework for two-sample homogeneity testing that explicitly incorporates such information. The response mechanism is modeled by a flexible semiparametric callback model, while the two population distributions are linked through a density ratio model. Within this unified framework, we propose an empirical likelihood ratio test for distributional homogeneity and show that, under the null hypothesis, it has a Wilks-type chi-square limit. To facilitate computation, we develop an efficient expectation-maximization-type algorithm. Simulation results show that the proposed method controls type I error well and achieves substantially higher power than existing methods that ignore nonignorable missingness. An application to real survey income data illustrates its practical value.
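For readers unfamiliar with the density ratio model mentioned above, the following sketch shows its commonly used textbook form and how distributional homogeneity reduces to a parametric restriction. The symbols $F_0$, $F_1$, $\alpha$, $\beta$, and the basis function $q(\cdot)$ are notation assumed here for illustration; the abstract does not give the paper's own notation or its choice of basis function.

```latex
% Standard density ratio model (DRM) linking two populations F_0 and F_1.
% q(x) is a prespecified vector of basis functions (an assumption here),
% for example q(x) = x or q(x) = (x, x^2)^T.
\[
  \frac{dF_1(x)}{dF_0(x)} = \exp\{\alpha + \beta^{\top} q(x)\}
\]
% Because F_1 must integrate to one, alpha is determined by beta:
\[
  \alpha = -\log \int \exp\{\beta^{\top} q(x)\}\, dF_0(x)
\]
% Under the DRM, homogeneity of the two distributions is equivalent to beta = 0
% (which forces alpha = 0):
\[
  H_0 : F_0 = F_1
  \quad\Longleftrightarrow\quad
  H_0 : \beta = 0
\]
```

Under this formulation the baseline distribution $F_0$ is left unspecified, which is what makes the framework semiparametric, and an empirical likelihood ratio statistic compares the maximized likelihood with and without the restriction $\beta = 0$.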