Estimating conditional average treatment effects (CATE) is one of the main challenges in causal inference with observational data. In addition to machine-learning-based models, nonparametric estimators called meta-learners have been developed to estimate the CATE; their main advantage is that they do not restrict the estimation to a specific supervised learning method. The task becomes more complicated, however, when the treatment is not binary, because limitations of the naive extensions emerge. This paper studies meta-learners for estimating the heterogeneous effects of multi-valued treatments. We consider several meta-learners and analyze theoretically how their error upper bounds depend on key parameters such as the number of treatment levels, showing that the naive extensions do not always provide satisfactory results. We introduce and discuss meta-learners that perform well as the number of treatment levels increases. We empirically confirm the strengths and weaknesses of these methods on synthetic and semi-synthetic datasets.