Bridging Binarization: Causal Inference with Dichotomized Continuous Exposures

The average treatment effect (ATE) is a common parameter estimated in the causal inference literature, but it is only defined for binary treatments. Thus, despite concerns raised by some researchers, many studies seeking to estimate the causal effect of a continuous treatment create a new binary treatment variable by dichotomizing the continuous values into two categories. In this paper, we affirm binarization as a statistically valid method for answering causal questions about continuous treatments by showing the equivalence between the binarized ATE and the difference in the average outcomes of two specific modified treatment policies. These policies impose cut-offs corresponding to the binarized treatment variable and assume preservation of relative self-selection. Relative self-selection is the ratio of the probability density of an individual having an exposure equal to one value of the continuous treatment variable to the probability density of that individual having an exposure equal to another value. The policies assume that, for any two values of the treatment variable with non-zero probability density after the cut-off, this ratio will remain unchanged. Through this equivalence, we clarify the assumptions underlying binarization and discuss how to properly interpret the resulting estimator. Additionally, we introduce a new target parameter that can be computed after binarization and that takes the status-quo world into account. We argue that this parameter addresses more relevant causal questions than the traditional binarized ATE parameter. Finally, we present a simulation study to illustrate the implications of these assumptions when analyzing data and to demonstrate how to correctly implement estimators of the parameters discussed.
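The equivalence described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's simulation study: the data-generating process, the single binary confounder `w`, the cut-off of 5.0, and all function names are assumptions chosen for illustration, and it assumes no unmeasured confounding. It compares the binarized ATE, estimated by stratifying on `w`, with the difference in mean outcomes under two policies that redraw each individual's exposure from their own natural exposure distribution restricted to one side of the cut-off. Restricting by rejection sampling leaves the density ratio between any two retained exposure values unchanged, which is one way to read "preservation of relative self-selection".

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
cutoff = 5.0  # hypothetical dichotomization threshold


def draw_exposure(w, rng):
    """Assumed natural exposure mechanism: A | W ~ Normal(4 + 2W, 1.5)."""
    return rng.normal(4.0 + 2.0 * w, 1.5)


def draw_outcome(a, w, rng):
    """Assumed outcome model with an exposure-by-confounder interaction."""
    noise = rng.normal(0.0, 1.0, size=a.shape)
    return 1.0 + 0.5 * a + 2.0 * w + 0.1 * a * w + noise


def draw_truncated(w, keep, rng):
    """Rejection-sample A | W restricted to the region where keep(a) is True.

    Redrawing from the natural mechanism until the draw lands in the region
    leaves density ratios between retained exposure values unchanged.
    """
    a = draw_exposure(w, rng)
    bad = ~keep(a)
    while bad.any():
        a[bad] = draw_exposure(w[bad], rng)
        bad = ~keep(a)
    return a


# Observational world: confounder, continuous exposure, outcome.
w = rng.binomial(1, 0.4, size=n)
a = draw_exposure(w, rng)
y = draw_outcome(a, w, rng)
t = a > cutoff  # binarized treatment

# Binarized ATE, adjusting for W by stratification.
ate_bin = 0.0
for w_val in (0, 1):
    in_stratum = w == w_val
    diff = y[in_stratum & t].mean() - y[in_stratum & ~t].mean()
    ate_bin += in_stratum.mean() * diff

# Interventional worlds: everyone above the cut-off vs. everyone at or below it.
a_hi = draw_truncated(w, lambda x: x > cutoff, rng)
a_lo = draw_truncated(w, lambda x: x <= cutoff, rng)
policy_diff = draw_outcome(a_hi, w, rng).mean() - draw_outcome(a_lo, w, rng).mean()

print(f"binarized ATE (stratified): {ate_bin:.3f}")
print(f"policy contrast:            {policy_diff:.3f}")
```

Under these assumptions the two printed quantities should agree up to Monte Carlo error. The agreement relies on the policies redrawing exposure from the truncated natural distribution; a policy that instead set everyone to a single fixed exposure value would generally target a different quantity.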

