Understanding causality should be a core requirement of any attempt to build real impact through AI. Because counterfactuals are inherently unobservable, large randomised controlled trials (RCTs) are the standard for causal inference. But large experiments are generally expensive, and randomisation carries its own costs, e.g. when suboptimal decisions are trialled. Recent work has proposed more sample-efficient alternatives to RCTs, but these cannot be adapted to the downstream application for which the causal effect is sought. In this work, we develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications. Across a range of important tasks, real-world datasets, and sample sizes, our method outperforms other benchmarks, e.g. requiring an order of magnitude less data to match RCT performance on targeted marketing tasks.