Estimating long-term causal effects based on short-term surrogates is a significant but challenging problem in many real-world applications, e.g., marketing and medicine. Despite their success in certain domains, most existing methods estimate causal effects in an idealized and simplistic way: they ignore the causal structure among short-term outcomes and treat all of them as surrogates. However, such methods do not apply well to real-world scenarios, in which partially observed surrogates are mixed with their proxies among the short-term outcomes. To this end, we develop a flexible method, Laser, to estimate long-term causal effects in the more realistic situation where the surrogates are either observed or have observed proxies. Given the indistinguishability between the surrogates and proxies, we utilize an identifiable variational auto-encoder (iVAE) to recover all valid surrogates from the full set of surrogate candidates, without needing to distinguish between observed surrogates and proxies of latent surrogates. With the help of the recovered surrogates, we further devise an unbiased estimator of long-term causal effects. Extensive experimental results on real-world and semi-synthetic datasets demonstrate the effectiveness of our proposed method.