In many causal inference settings, the treatment of interest is not directly observed; instead, one or more error-prone proxy measurements are available, creating a fundamental identification challenge. Building on identification methods for hidden treatments with proxies, we develop a general semiparametric framework for causal effect estimation that accommodates several common estimands. Hidden treatment models differ fundamentally from classical missing-data settings, where semiparametric theory relies on positivity requiring a nonzero probability of observing a complete case for each treatment value. Here this assumption fails by design because the true treatment is never observed, creating new challenges for semiparametric characterization and efficient estimation. To overcome these challenges, we develop a new semiparametric characterization for hidden treatment models by deriving a formal mapping between the orthogonal complement to the nuisance tangent space, which contains all influence functions for a causal functional in the oracle full-data model, and its counterpart in the observed hidden treatment model. This mapping gives closed-form observed-data influence functions and yields the semiparametric efficiency bound. It also leads to semiparametric efficient estimators with a new form of multiple robustness or mixed bias property, enabling inference with nonparametric nuisance estimators. A further challenge is that some nuisance functions depend on the hidden treatment, preventing direct use of standard nonparametric regression methods. We introduce an iterative estimation algorithm and establish its large-sample properties. Simulations demonstrate the finite-sample performance of the proposed estimators, and an application estimates the causal effect of Alzheimer's disease on hippocampal volume using data from the Alzheimer's Disease Neuroimaging Initiative.