Causal discovery and causal effect estimation are two fundamental tasks in causal inference. While many methods have been developed for each task individually, statistical challenges arise when applying these methods jointly: estimating causal effects after running causal discovery algorithms on the same data leads to "double dipping," invalidating the coverage guarantees of classical confidence intervals. To address this problem, we develop tools for valid post-causal-discovery inference. Across empirical studies, we show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates; in contrast, our method provides reliable coverage while achieving more accurate causal discovery than data splitting.
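The "double dipping" problem can be illustrated with a small simulation. The sketch below is not the method developed in this work, and it does not run a real causal discovery algorithm. As a loudly simplified stand-in, it assumes a toy selection step that picks the candidate effect with the largest absolute estimate, then forms a classical 95% interval for it. All names (`simulate_coverage`, `num_candidates`, and so on) are hypothetical, every true effect is set to zero, and the noise variance is assumed known.

```python
import random


def simulate_coverage(num_trials=4000, num_candidates=10, n=100, z=1.96, seed=0):
    """Return (naive_coverage, split_coverage) for a toy select-then-infer task.

    Assumption: every true effect is 0, so an interval "covers" when it
    contains 0. Each candidate effect is estimated by a sample mean of
    unit-variance Gaussian noise, simulated directly at the level of means.
    """
    rng = random.Random(seed)
    half_sd = (2.0 / n) ** 0.5  # std. dev. of a mean over n/2 observations
    full_sd = (1.0 / n) ** 0.5  # std. dev. of a mean over all n observations
    naive_hits = 0
    split_hits = 0
    for _ in range(num_trials):
        # Estimates from the two halves of the data, one per candidate.
        first = [rng.gauss(0.0, half_sd) for _ in range(num_candidates)]
        second = [rng.gauss(0.0, half_sd) for _ in range(num_candidates)]
        # The full-data estimate is the average of the two half estimates.
        full = [(a + b) / 2.0 for a, b in zip(first, second)]

        # Naive: select and build the interval on the same (full) data.
        j = max(range(num_candidates), key=lambda k: abs(full[k]))
        naive_hits += abs(full[j]) <= z * full_sd

        # Data splitting: select on the first half, infer on the second.
        j = max(range(num_candidates), key=lambda k: abs(first[k]))
        split_hits += abs(second[j]) <= z * half_sd
    return naive_hits / num_trials, split_hits / num_trials


naive_cov, split_cov = simulate_coverage()
print(f"naive coverage: {naive_cov:.3f}, data-splitting coverage: {split_cov:.3f}")
```

In this toy setting the naive interval covers only when all ten independent estimates fall inside their own 95% bands, so its coverage should sit near 0.95^10, roughly 0.60, rather than the nominal 0.95. Data splitting restores nominal coverage here, but it selects using only half of the data, which is the loss in discovery accuracy that the abstract contrasts against.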