We introduce a new convexified matching method for missing value imputation and individualized inference, inspired by computational optimal transport. Our method integrates favorable features of three mainstream imputation approaches: optimal matching, regression imputation, and synthetic control. We impute counterfactual outcomes as convex combinations of observed outcomes, with weights defined by an optimal coupling between the treated and control data sets. The optimal coupling problem can be viewed as a convex relaxation of the combinatorial optimal matching problem. By properly constraining the coupling, we estimate granular-level individual treatment effects while maintaining a desirable aggregate-level summary. We construct transparent, individual confidence intervals for the estimated counterfactual outcomes. We devise fast iterative entropic-regularized algorithms for the optimal coupling problem that scale favorably when the number of units to match is large. Entropic regularization plays a crucial role in both inference and computation: it helps control the width of the individual confidence intervals and enables the design of fast optimization algorithms.
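To make the idea of imputing counterfactuals from an entropic-regularized coupling concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it uses the standard Sinkhorn iteration with uniform marginals and a squared Euclidean covariate cost, and it omits the paper's additional constraints on the coupling and its confidence intervals. The function names `sinkhorn_coupling` and `impute_counterfactuals`, the regularization strength `eps`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_coupling(cost, eps=0.1, n_iter=500):
    """Entropic-regularized coupling with uniform marginals (standard Sinkhorn)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # assumed uniform mass on treated units
    b = np.full(m, 1.0 / m)   # assumed uniform mass on control units
    K = np.exp(-cost / eps)   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)       # rescale to match row marginals
        v = b / (K.T @ u)     # rescale to match column marginals
    return u[:, None] * K * v[None, :]

def impute_counterfactuals(x_treated, x_control, y_control, eps=0.1):
    """Impute each treated unit's control outcome as a convex combination."""
    # Pairwise squared Euclidean distances between covariates (assumed cost).
    diff = x_treated[:, None, :] - x_control[None, :, :]
    cost = (diff ** 2).sum(axis=2)
    cost = cost / cost.max()  # normalize so eps has a comparable scale
    P = sinkhorn_coupling(cost, eps)
    # Row-normalize the coupling: each row becomes nonnegative weights summing to 1.
    W = P / P.sum(axis=1, keepdims=True)
    return W @ y_control, W

# Toy data, fabricated purely for illustration.
rng = np.random.default_rng(0)
x_t = rng.normal(size=(5, 2))   # covariates of treated units
x_c = rng.normal(size=(8, 2))   # covariates of control units
y_c = x_c.sum(axis=1)           # observed control outcomes
y0_hat, W = impute_counterfactuals(x_t, x_c, y_c)
```

In this sketch a smaller `eps` concentrates each row of `W` on the nearest control units, which approaches hard matching, while a larger `eps` spreads the weights more evenly across controls. Given observed treated outcomes, an individual effect estimate would be the observed outcome minus the corresponding entry of `y0_hat`.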