Data-Driven Influence Functions for Optimization-Based Causal Inference

We study a constructive algorithm that approximates Gateaux derivatives of statistical functionals by finite differencing, with a focus on functionals that arise in causal inference. We consider the case where probability distributions are not known a priori and must be estimated from data. These estimated distributions lead to empirical Gateaux derivatives, and we characterize the relationships among empirical, numerical, and analytical Gateaux derivatives. Starting with a case study of the interventional mean (average potential outcome), we delineate the relationship between finite differences and the analytical Gateaux derivative. We then derive requirements on the rates of numerical approximation in perturbation and smoothing that preserve the statistical benefits of one-step adjustments, such as rate double robustness. Next, we turn to more complicated functionals such as dynamic treatment regimes, the linear-programming formulation for policy optimization in infinite-horizon Markov decision processes (MDPs), and sensitivity analysis in causal inference. More broadly, we study optimization-based estimators, which give rise to a class of estimands for which identification via regression adjustment is straightforward but obtaining influence functions under minor variations thereof is not. The ability to approximate bias adjustments in the presence of arbitrary constraints illustrates the usefulness of constructive approaches for Gateaux derivatives. We also find that the statistical structure of the functional (rate double robustness) can permit less conservative rates for finite-difference approximation. This property, however, can be specific to particular functionals; e.g., it occurs for the average potential outcome (hence the average treatment effect) but not for the infinite-horizon MDP policy value.
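To make the finite-differencing idea concrete, the following is a minimal sketch for the interventional mean in a fully discrete setting. It is an illustration under stated assumptions, not the paper's implementation: the joint distribution is a known probability table `p[x, a, y]` rather than an estimate from data, the treatment is binary, and every name (`interventional_mean`, `finite_difference_gateaux`, `analytic_influence`) is hypothetical. The sketch perturbs the distribution toward a point mass at one observation, takes a finite difference of the functional, and compares the result with the standard closed-form influence function of the average potential outcome.

```python
import numpy as np


def interventional_mean(p, y_vals):
    """psi(P) = sum_x P(x) * E_P[Y | A=1, X=x] for a joint pmf p[x, a, y]."""
    p_x = p.sum(axis=(1, 2))               # marginal P(X = x)
    p_x_a1 = p[:, 1, :].sum(axis=1)        # P(X = x, A = 1)
    mu1 = (p[:, 1, :] @ y_vals) / p_x_a1   # E[Y | A = 1, X = x]
    return float(p_x @ mu1)


def finite_difference_gateaux(p, obs, eps, y_vals):
    """Finite-difference Gateaux derivative in the direction delta_obs - P."""
    point_mass = np.zeros_like(p)
    point_mass[obs] = 1.0                  # obs is an index tuple (x, a, y_index)
    p_eps = (1.0 - eps) * p + eps * point_mass
    return (interventional_mean(p_eps, y_vals) - interventional_mean(p, y_vals)) / eps


def analytic_influence(p, obs, y_vals):
    """Closed form: 1{a=1} / e(x) * (y - mu1(x)) + mu1(x) - psi."""
    x, a, y_idx = obs
    p_x = p.sum(axis=(1, 2))
    p_x_a1 = p[:, 1, :].sum(axis=1)
    mu1 = (p[:, 1, :] @ y_vals) / p_x_a1
    psi = p_x @ mu1
    propensity = p_x_a1[x] / p_x[x]        # e(x) = P(A = 1 | X = x)
    return float((a == 1) / propensity * (y_vals[y_idx] - mu1[x]) + mu1[x] - psi)


# Hypothetical toy distribution: 3 covariate levels, binary treatment, 4 outcome levels.
rng = np.random.default_rng(0)
p = rng.random((3, 2, 4)) + 0.5            # strictly positive cells, so overlap holds
p /= p.sum()
y_vals = np.array([0.0, 1.0, 2.0, 3.0])

obs = (2, 1, 3)                            # illustrative observation (x, a, y_index)
fd = finite_difference_gateaux(p, obs, eps=1e-7, y_vals=y_vals)
exact = analytic_influence(p, obs, y_vals)
print(f"finite difference: {fd:.6f}, analytic: {exact:.6f}")
```

In this toy case the truncation error of the finite difference shrinks linearly in `eps`. With estimated rather than known distributions, and with continuous covariates that require smoothing, the choice of `eps` interacts with estimation error, which is the regime the rate requirements in the abstract address and which this sketch does not cover.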
