Design-based inference, also known as randomization-based or finite-population inference, provides a principled framework for trustworthy statistical inference by attributing randomness solely to the design mechanism (e.g., treatment assignment, survey sampling, or missingness), without imposing distributional or modeling assumptions on outcome data. Despite its conceptual appeal and long history, applying this framework becomes challenging when the underlying design probabilities (i.e., propensity scores) are unknown, as is common in observational studies, real-world surveys, and missing-data settings. Existing plug-in and matching-based methods either ignore uncertainty from propensity score estimation or rely on near-exact covariate matching, often leading to systematic under-coverage. Existing finite-population M-estimation approaches, meanwhile, remain largely restricted to parametric propensity score models. In this work, we propose propensity score propagation, a general framework for valid design-based inference with unknown propensity scores. The framework introduces a regeneration-and-union procedure to propagate uncertainty from propensity score estimation into downstream design-based inference. It accommodates both parametric and nonparametric propensity score models, integrates seamlessly with existing design-based inference methods developed under known propensity scores, and applies broadly across design-based inference problems. Theoretical results and simulation studies show that the proposed framework achieves nominal coverage, even in settings where conventional approaches exhibit substantial under-coverage.