Causal discovery and causal effect estimation are two fundamental tasks in causal inference. While many methods have been developed for each task individually, statistical challenges arise when they are applied jointly: estimating causal effects after running a causal discovery algorithm on the same data amounts to "double dipping," which invalidates the coverage guarantees of classical confidence intervals. To address this problem, we develop tools for valid post-causal-discovery inference. Across empirical studies, we show that naively combining causal discovery with subsequent inference leads to highly inflated miscoverage rates; in contrast, our method provides reliable coverage while achieving more accurate causal discovery than data splitting.
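The "double dipping" problem can be illustrated with a small simulation. The sketch below is a toy stand-in, not the method developed in this work: the "discovery" step is assumed to be simply picking the covariate most correlated with the outcome, the intervals are normal-approximation 95% intervals for a regression slope, and all function names (`slope_ci`, `select_strongest`, `simulate_coverage`) and parameter values are hypothetical. Every true slope is zero, so an interval covers the truth exactly when it contains 0.

```python
import numpy as np


def slope_ci(x, y, z=1.96):
    """Normal-approximation 95% CI for the slope of y on x (with intercept)."""
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = np.dot(xc, xc)
    beta = np.dot(xc, yc) / sxx
    resid = yc - beta * xc
    sigma2 = np.dot(resid, resid) / (len(x) - 2)
    se = np.sqrt(sigma2 / sxx)
    return beta - z * se, beta + z * se


def select_strongest(X, y):
    """Toy 'discovery' step (an assumption): the column most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return int(np.argmax(np.abs(corr)))


def simulate_coverage(n=200, p=20, reps=500, seed=0):
    """Empirical coverage of the true slope (0) for naive vs. split inference."""
    rng = np.random.default_rng(seed)
    half = n // 2
    naive_hits = 0
    split_hits = 0
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        y = rng.standard_normal(n)  # outcome is pure noise: all true slopes are 0

        # Naive: select and infer on the same data (double dipping).
        j = select_strongest(X, y)
        lo, hi = slope_ci(X[:, j], y)
        naive_hits += int(lo <= 0.0 <= hi)

        # Data splitting: select on the first half, infer on the second half.
        j = select_strongest(X[:half], y[:half])
        lo, hi = slope_ci(X[half:, j], y[half:])
        split_hits += int(lo <= 0.0 <= hi)
    return naive_hits / reps, split_hits / reps


naive_cov, split_cov = simulate_coverage()
print(f"naive coverage: {naive_cov:.3f}")
print(f"split coverage: {split_cov:.3f}")
```

In this toy setting the naive interval should cover far less often than the nominal 95%, because the selected covariate is by construction the one that looks most significant, while the split interval should stay close to nominal. Data splitting pays for that validity by using only half of the sample for selection, which is the accuracy cost referred to above.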