We analyze the split-sample robust inference (SSRI) methodology proposed by Chernozhukov, Demirer, Duflo, and Fernandez-Val (CDDF) for quantifying uncertainty in heterogeneous treatment effect estimation. SSRI accounts for the randomness introduced by data splitting, but because it requires repeated splitting, its computational cost can be prohibitive when combined with complex machine learning (ML) models. We present an alternative randomization inference (RI) approach that retains SSRI's generality without requiring repeated data splitting. By combining cross-fitting with design-based inference, RI yields valid confidence intervals at a substantially lower computational cost. We compare the two methods in simulations, which show that RI retains statistical efficiency while being more practical for large-scale applications.
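For readers unfamiliar with design-based inference, the following is a minimal, generic sketch of a Fisher randomization test for the sharp null of no treatment effect. It is not the RI procedure proposed in this paper, which also involves cross-fitted ML estimates; it illustrates only the idea of re-drawing treatment assignments from the known design. The function name `randomization_p_value`, the difference-in-means statistic, the complete-randomization design, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def randomization_p_value(y, d, n_draws=2000, seed=0):
    """Two-sided randomization p-value for the sharp null of no effect.

    Assumes complete randomization: the number of treated units is fixed
    and every assignment with that count is equally likely.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=bool)

    def diff_in_means(assign):
        return y[assign].mean() - y[~assign].mean()

    observed = diff_in_means(d)
    # Under the sharp null, outcomes do not depend on assignment, so we can
    # recompute the statistic under re-drawn assignments (label permutations).
    draws = np.array([diff_in_means(rng.permutation(d)) for _ in range(n_draws)])
    # The add-one correction keeps the Monte Carlo p-value strictly positive.
    p = (1 + np.sum(np.abs(draws) >= abs(observed))) / (n_draws + 1)
    return observed, p

# Synthetic example (illustrative only): 200 units, true effect of 1.0.
rng = np.random.default_rng(1)
d = rng.permutation(np.repeat([True, False], 100))
y = rng.normal(size=200) + 1.0 * d
observed, p = randomization_p_value(y, d)
```

The reference distribution here comes entirely from the assignment mechanism rather than from repeated sample splits, which is the sense in which such procedures are called design-based.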