Many scientific questions in biomedical, environmental, and psychological research involve understanding the impact of multiple factors on outcomes. While randomized factorial experiments are ideal for this purpose, randomization is infeasible in many empirical studies. Investigators therefore often rely on observational data, where drawing reliable causal inferences for multiple factors remains challenging. Because the number of treatment combinations grows exponentially with the number of factors, some treatment combinations can be rare or even missing by chance in the observed data, which further complicates the estimation of factorial effects. To address these challenges, we propose a novel weighting method tailored to observational studies with multiple factors. Our approach uses weighted observational data to emulate a randomized factorial experiment, enabling simultaneous estimation of the effects of multiple factors and their interactions. Our investigation reveals a crucial nuance: balancing the covariates, as in single-factor settings, is necessary but not sufficient for unbiased estimation of factorial effects. Our findings suggest that balancing the factors is also essential in multi-factor settings. Moreover, we extend our weighting method to handle treatment combinations that are missing from the observed data. Finally, we study the asymptotic behavior of the new weighting estimators and propose a consistent variance estimator, enabling reliable inference on factorial effects in observational studies.
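To make the idea of emulating a factorial experiment with weights concrete, the following is a minimal sketch for two binary factors and one binary confounder. It does not implement the balancing weights proposed here. It swaps in plain inverse-probability weighting, with the probability of each treatment combination estimated by within-stratum frequencies. The simulated data-generating model, its coefficients, and the names `simulate`, `ipw_weights`, and `factorial_effects` are all hypothetical and chosen only for illustration. Factorial effects are defined with the usual contrast vectors scaled by 1/2^(K-1), here with K = 2.

```python
import numpy as np

# Treatment combinations are coded 0..3 as (z1, z2) = (0,0), (0,1), (1,0), (1,1).
# Rows: main effect of factor 1, main effect of factor 2, two-way interaction.
CONTRASTS = np.array([[-1, -1,  1, 1],
                      [-1,  1, -1, 1],
                      [ 1, -1, -1, 1]])

def simulate(n=20000, seed=0):
    """Hypothetical observational data: the confounder x shifts both the
    treatment-combination probabilities and the outcome."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)
    probs = np.array([[0.4, 0.3, 0.2, 0.1],   # P(combination | x = 0)
                      [0.1, 0.2, 0.3, 0.4]])  # P(combination | x = 1)
    combo = np.empty(n, dtype=int)
    for s in (0, 1):
        idx = x == s
        combo[idx] = rng.choice(4, size=int(idx.sum()), p=probs[s])
    z1, z2 = combo // 2, combo % 2
    y = 3.0 * x + 2.0 * z1 + 1.0 * z2 + 0.5 * z1 * z2 + rng.normal(size=n)
    return x, combo, y

def ipw_weights(x, combo):
    """Inverse of the estimated probability of the observed combination,
    using empirical frequencies within each stratum of x."""
    counts = np.zeros((2, 4))
    np.add.at(counts, (x, combo), 1)
    e_hat = counts / counts.sum(axis=1, keepdims=True)
    return 1.0 / e_hat[x, combo]

def factorial_effects(y, combo, w):
    """Normalized weighted mean outcome per combination, then contrasts
    scaled by 1 / 2^(K-1) with K = 2 factors."""
    mu = (np.bincount(combo, weights=w * y, minlength=4)
          / np.bincount(combo, weights=w, minlength=4))
    return CONTRASTS @ mu / 2.0

x, combo, y = simulate()
naive = factorial_effects(y, combo, np.ones_like(y))
weighted = factorial_effects(y, combo, ipw_weights(x, combo))
# Under the simulated model, the true effects are 2.25, 1.25, and 0.25.
print("naive   :", naive)
print("weighted:", weighted)
```

In this toy setup the unweighted contrasts absorb the confounder's effect into the main effects, while the weighted contrasts should land near the true values. With many factors, some combinations would be rare or empty, the frequency-based weights above would become unstable or undefined, and that is the regime the proposed method is designed for.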