Interaction Testing in Variation Analysis

Relationships of cause and effect are of prime importance for explaining scientific phenomena. Often, rather than just understanding the effects of causes, researchers also wish to understand how a cause $X$ affects an outcome $Y$ mechanistically -- i.e., which causal pathways are activated between $X$ and $Y$. To analyze such questions, a range of methods has been developed over decades under the rubric of causal mediation analysis. Traditional mediation analysis decomposes the average treatment effect (ATE) into direct and indirect effects, and therefore treats the ATE as the central quantity. This corresponds to providing explanations for associations in the interventional regime, such as when the treatment $X$ is randomized. Commonly, however, it is of interest to explain associations in the observational regime, and not just in the interventional regime. In this paper, we introduce \textit{variation analysis}, an extension of mediation analysis that focuses on the total variation (TV) measure between $X$ and $Y$, written as $\mathrm{E}[Y \mid X=x_1] - \mathrm{E}[Y \mid X=x_0]$. The TV measure encompasses both causal and confounded effects, as opposed to the ATE, which encompasses only causal (direct and mediated) variations. In this way, the TV measure is suitable for providing explanations in the natural regime and answering questions such as ``why is $X$ associated with $Y$?''. Our focus is on decomposing the TV measure in a way that explicitly includes direct, indirect, and confounded variations. Furthermore, we extend the decomposition to include interaction terms between these different pathways. We then introduce interaction testing: hypothesis tests that determine whether the interaction terms are significantly different from zero. If the interactions are not significant, more parsimonious decompositions of the TV measure can be used.
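To make the distinction between the TV measure and the ATE concrete, the following minimal sketch computes both exactly in a toy model. Everything in it is an illustrative assumption rather than part of the paper: a single binary confounder $Z$, the probabilities in `P_Z` and `P_X1_GIVEN_Z`, the outcome mean $\mathrm{E}[Y \mid X=x, Z=z] = x + 2z$, and the helper names. There is no mediator, so the gap between TV and ATE here is due to confounding alone; it is not the decomposition developed in the paper.

```python
from fractions import Fraction as F

# Hypothetical toy model (all numbers are illustrative assumptions):
#   Z ~ Bernoulli(1/2) is a confounder of X and Y,
#   P(X=1 | Z=z) is given by P_X1_GIVEN_Z,
#   E[Y | X=x, Z=z] = x + 2*z.
P_Z = {0: F(1, 2), 1: F(1, 2)}
P_X1_GIVEN_Z = {0: F(1, 5), 1: F(4, 5)}


def mean_y(x, z):
    """E[Y | X=x, Z=z] under the assumed outcome model."""
    return x + 2 * z


def p_x_given_z(x, z):
    """P(X=x | Z=z) for binary x."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1 - p1


def cond_mean_y(x):
    """Observational mean E[Y | X=x]: average over P(Z | X=x)."""
    num = sum(mean_y(x, z) * p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    den = sum(p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    return num / den


def do_mean_y(x):
    """Interventional mean E[Y | do(X=x)]: average over the marginal P(Z)."""
    return sum(mean_y(x, z) * P_Z[z] for z in P_Z)


tv = cond_mean_y(1) - cond_mean_y(0)   # E[Y | X=1] - E[Y | X=0]
ate = do_mean_y(1) - do_mean_y(0)      # E[Y | do(X=1)] - E[Y | do(X=0)]
gap = tv - ate                         # variation due to confounding in this toy model

print(tv, ate, gap)  # prints: 11/5 1 6/5
```

In this toy model the observed association (TV = 11/5) is more than twice the causal effect (ATE = 1); the remainder arises because units with $X=1$ are more likely to have $Z=1$. Exact fractions are used instead of simulation so that the numbers are reproducible.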
