In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for disentangling causally related latent variables have been established in stationary settings by leveraging temporal structure. In nonstationary settings, however, existing work addresses the problem only partially, either by using observed auxiliary variables (e.g., class labels and/or domain indexes) as side information or by assuming simplified latent causal dynamics. Both restrict the methods to a limited range of scenarios. In this study, we further explore the Markov assumption for time-delayed causally related processes in the nonstationary setting and show that, under mild conditions, the independent latent components can be recovered from their nonlinear mixture up to a permutation and a component-wise transformation, without observing auxiliary variables. We then introduce NCTRL, a principled estimation framework that reconstructs time-delayed latent causal variables and identifies their relations from measured sequential data only. Empirical evaluations demonstrate reliable identification of time-delayed latent causal influences, with our method substantially outperforming existing baselines that fail to exploit the nonstationarity adequately and consequently cannot distinguish distribution shifts.
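To make the setting concrete, the following is a minimal simulation sketch of the kind of data the abstract describes: an unobserved regime that switches over time according to a Markov chain, latent variables with lag-1 causal dynamics that depend on the regime, and observations produced by an invertible nonlinear mixing. It is an illustration only and not the NCTRL estimator. The function name `simulate`, the tanh transition, the leaky-ReLU mixing, the noise scale, and all parameter values are assumptions chosen for the example rather than details taken from the paper.

```python
import numpy as np


def simulate(T=200, n_latent=3, n_regimes=2, seed=0):
    """Toy nonstationary time-delayed latent process (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Assumed "sticky" Markov chain over unobserved regimes (requires n_regimes >= 2).
    stay = 0.95
    A = np.full((n_regimes, n_regimes), (1.0 - stay) / (n_regimes - 1))
    np.fill_diagonal(A, stay)

    # One lag-1 transition matrix per regime: the latent dynamics change with the regime.
    B = rng.normal(scale=0.5, size=(n_regimes, n_latent, n_latent))

    # Mixing weights; a random Gaussian square matrix is invertible almost surely.
    W = rng.normal(size=(n_latent, n_latent))

    c = np.zeros(T, dtype=int)          # regime index per time step (never observed)
    z = np.zeros((T, n_latent))         # latent causal variables
    z[0] = rng.normal(size=n_latent)
    for t in range(1, T):
        c[t] = rng.choice(n_regimes, p=A[c[t - 1]])
        # Time-delayed influence: z[t] depends on z[t-1] through the current regime.
        z[t] = np.tanh(B[c[t]] @ z[t - 1]) + 0.1 * rng.normal(size=n_latent)

    # Nonlinear mixing: linear map followed by a leaky ReLU (invertible when W is).
    h = z @ W
    x = np.where(h > 0, h, 0.2 * h)
    return x, z, c


if __name__ == "__main__":
    x, z, c = simulate()
    print(x.shape, z.shape, np.bincount(c))
```

In this sketch only `x` would be available to a learner; `z` and `c` are returned so that a recovered representation could be compared with the ground truth up to a permutation and a component-wise transformation.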