Adjusting for latent covariates is crucial for estimating causal effects from observational textual data. Most existing methods account only for confounding covariates, which affect both treatment and outcome, and this can bias the estimated causal effects. The bias arises from insufficient consideration of non-confounding covariates, which are relevant to only the treatment or only the outcome. In this work, we aim to mitigate this bias by unveiling the interactions between variables so that the non-confounding covariates are disentangled when estimating causal effects from text. The disentangling process ensures that each covariate contributes only to its respective objective, enabling independence between variables. Additionally, we impose a constraint that balances the representations of the treatment and control groups to alleviate selection bias. We conduct experiments on two different treatment factors under various scenarios, and the proposed model significantly outperforms recent strong baselines. Furthermore, our analysis of earnings call transcripts demonstrates that the model can effectively disentangle the variables, and further investigation of real-world scenarios provides guidance that helps investors make informed decisions.
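The abstract does not specify the form of the balancing constraint. As a minimal sketch, assuming the constraint penalizes the squared distance between the mean representations of the treatment and control groups (one common choice, not necessarily the one used in this work), it could look like the following. The names `mean_vector` and `balance_penalty` are hypothetical and introduced only for illustration.

```python
from typing import List, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Return the element-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    dim = len(rows[0])
    return [sum(row[j] for row in rows) / n for j in range(dim)]


def balance_penalty(
    representations: Sequence[Sequence[float]],
    treatments: Sequence[int],
) -> float:
    """Squared Euclidean distance between group mean representations.

    ASSUMPTION: this mean-matching penalty is an illustrative stand-in for
    the balancing constraint described in the abstract, whose exact form
    is not given there.
    """
    treated = [r for r, t in zip(representations, treatments) if t == 1]
    control = [r for r, t in zip(representations, treatments) if t == 0]
    if not treated or not control:
        raise ValueError("both treatment and control groups must be non-empty")

    mu_treated = mean_vector(treated)
    mu_control = mean_vector(control)
    # The penalty is zero when the two group means coincide and grows as
    # the representations of the two groups drift apart.
    return sum((a - b) ** 2 for a, b in zip(mu_treated, mu_control))


# Toy representations: two treated units followed by two control units.
reps = [[1.0, 0.0], [3.0, 0.0], [0.0, 0.0], [0.0, 2.0]]
treat = [1, 1, 0, 0]
penalty = balance_penalty(reps, treat)
```

In a training setup, a term of this kind would typically be added to the outcome-prediction loss with a weighting coefficient, so that the learned representations stay predictive while differing less between the two groups.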