Estimating conditional average treatment effects (CATE) is one of the main challenges in causal inference with observational data. In addition to machine learning-based models, nonparametric estimators called meta-learners have been developed to estimate the CATE; their main advantage is that they do not restrict the estimation to a specific supervised learning method. The task becomes more complicated, however, when the treatment is not binary, because limitations of the naive extensions emerge. This paper studies meta-learners for estimating the heterogeneous effects of multi-valued treatments. We consider several meta-learners and analyze their error upper bounds theoretically as functions of key parameters such as the number of treatment levels, showing that the naive extensions do not always provide satisfactory results. We introduce and discuss meta-learners that perform well as the number of treatment levels increases. We empirically confirm the strengths and weaknesses of these methods on synthetic and semi-synthetic datasets.
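To make the notion of a naive multi-valued extension concrete, the following is a minimal sketch of a T-learner-style estimator that fits one outcome model per treatment level and reports the CATE of each level relative to a baseline level. It is an illustration under stated assumptions, not the estimators analyzed in the paper: the names `fit_line` and `t_learner`, the one-dimensional covariate, the least-squares base learner, and the synthetic data-generating process are all hypothetical choices made for this example.

```python
import random
from statistics import fmean


def fit_line(xs, ys):
    """Stand-in base learner (assumption): least squares for y = a + b * x."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def t_learner(xs, ts, ys, levels, fit=fit_line):
    """Fit one outcome model per treatment level.

    Returns, for every non-baseline level k, a function estimating the
    CATE of level k versus the baseline level levels[0].
    """
    models = {}
    for k in levels:
        # Each model only sees the units that received level k.
        xk = [x for x, t in zip(xs, ts) if t == k]
        yk = [y for y, t in zip(ys, ts) if t == k]
        models[k] = fit(xk, yk)
    base = levels[0]
    return {k: (lambda x, k=k: models[k](x) - models[base](x))
            for k in levels[1:]}


# Hypothetical synthetic data: y = x + t * x + noise,
# so the true CATE of level t versus level 0 at covariate x is t * x.
rng = random.Random(0)
levels = [0, 1, 2]
xs = [rng.uniform(0, 1) for _ in range(3000)]
ts = [rng.choice(levels) for _ in xs]
ys = [x + t * x + rng.gauss(0, 0.1) for x, t in zip(xs, ts)]

cate = t_learner(xs, ts, ys, levels)
print(cate[1](0.5), cate[2](0.5))  # estimates of the true values 0.5 and 1.0
```

Because `fit` is an argument, any supervised regressor can be plugged in, which is the sense in which a meta-learner is not tied to a specific learning method. Note also that each per-level model is trained only on the units that received that level, so the sample available to each model shrinks as the number of treatment levels grows.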