Limits of Approximating the Median Treatment Effect

Average Treatment Effect (ATE) estimation is a well-studied problem in causal inference. However, the ATE does not necessarily capture the heterogeneity in the data, and several approaches have been proposed to address this, including estimating Quantile Treatment Effects. In the finite population setting with $n$ individuals, where the treatment and control values are given by the potential outcome vectors $\mathbf{a}, \mathbf{b}$, much of the prior work has focused on estimating $\mathrm{median}(\mathbf{a}) - \mathrm{median}(\mathbf{b})$, where $\mathrm{median}(\mathbf{x})$ denotes the median value in the sorted ordering of all the values in the vector $\mathbf{x}$. Estimating this difference of medians is known to be easier than estimating the desired estimand, $\mathrm{median}(\mathbf{a}-\mathbf{b})$, called the Median Treatment Effect (MTE). What makes estimating the MTE particularly challenging is the fundamental problem of causal inference: for every individual $i$, we can observe only one of the potential outcome values, either $a_i$ or $b_i$, but never both. In this work, we argue that the MTE is not estimable and detail a novel notion of approximation that relies on the sorted order of the values in $\mathbf{a}-\mathbf{b}$. Next, we identify a quantity called variability that exactly captures the complexity of MTE estimation. By drawing connections to instance-optimality as studied in theoretical computer science, we show that every algorithm for estimating the MTE obtains an approximation error that is no better than the error of an algorithm that computes variability. Finally, we provide a simple linear-time algorithm for computing the variability exactly. Unlike much prior work, we make no assumptions about how the potential outcome vectors are generated or how they are correlated, except that the potential outcome values are $k$-ary, i.e., each takes one of $k$ discrete values.
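
The gap between the two estimands, and the observation constraint that makes the MTE hard, can be illustrated on a toy instance. The sketch below is only an illustration under stated assumptions: the vectors, the assignment, and the helper names (`difference_of_medians`, `median_treatment_effect`, `observe`) are hypothetical and do not come from the paper, and the code does not implement the paper's approximation notion or its variability algorithm.

```python
from statistics import median


def difference_of_medians(a, b):
    """median(a) - median(b): the quantity much prior work estimates."""
    return median(a) - median(b)


def median_treatment_effect(a, b):
    """median(a - b): the MTE, which needs both outcomes for every individual."""
    return median([a_i - b_i for a_i, b_i in zip(a, b)])


def observe(a, b, treated):
    """Reveal only one potential outcome per individual.

    treated[i] == 1 reveals a_i, treated[i] == 0 reveals b_i.
    """
    return [a_i if t else b_i for a_i, b_i, t in zip(a, b, treated)]


# Hypothetical 3-ary potential outcomes (each value is one of 1, 2, 3).
a = [1, 2, 3]
b = [3, 1, 2]

print(difference_of_medians(a, b))    # 2 - 2 = 0
print(median_treatment_effect(a, b))  # median of [-2, 1, 1] = 1
print(observe(a, b, [1, 0, 1]))       # [1, 1, 3]
```

On this instance the difference of medians is 0 while the MTE is 1, so the two estimands genuinely differ. The output of `observe` also shows that no individual difference $a_i - b_i$ is ever seen directly, which is why the MTE cannot be computed from the observed data alone.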
