Identifying the underlying time-delayed latent causal processes in sequential data is vital for understanding temporal dynamics and supporting downstream reasoning. While some recent methods can robustly identify these latent causal variables, they rely on the strict assumption that the generation process from latent variables to observed data is invertible. This assumption is often hard to satisfy in real-world applications that involve information loss. For instance, visual perception projects a 3D space onto 2D images, and persistence of vision blends historical information into the current percept. To address this challenge, we establish an identifiability theory that allows for the recovery of independent latent components even when they come from a nonlinear and non-invertible mixing. Using this theory as a foundation, we propose a principled approach, CaRiNG, to learn the CAusal RepresentatIon of Non-invertible Generative temporal data with identifiability guarantees. Specifically, we use temporal context to recover lost latent information and apply the conditions in our theory to guide the training process. Through experiments on synthetic datasets, we validate that CaRiNG reliably identifies the causal process even when the generation process is non-invertible. Moreover, we demonstrate that our approach considerably improves temporal understanding and reasoning in practical applications.
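The idea that temporal context can recover information lost by a non-invertible generation process can be illustrated with a toy example. The sketch below is not CaRiNG and makes strong simplifying assumptions that the abstract does not: the latent process is a known, noise-free 2D rotation, and the observation keeps only the first latent coordinate. All names (`rotation`, `simulate`, `observe`, `recover_with_context`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def rotation(theta):
    """2D rotation matrix; stands in for an assumed, known latent transition."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def simulate(z0, theta, steps):
    """Roll out the toy latent process z_t = R z_{t-1} (noise-free by assumption)."""
    R = rotation(theta)
    zs = [np.asarray(z0, dtype=float)]
    for _ in range(steps - 1):
        zs.append(R @ zs[-1])
    return np.stack(zs)

def observe(zs):
    """Non-invertible generation: keep only the first latent coordinate."""
    return zs[:, 0]

def recover_with_context(x_prev, x_curr, theta):
    """Recover z_t from (x_{t-1}, x_t).

    Since z_{t-1} = R^T z_t, we have x_{t-1} = cos(theta) * z_t[0] + sin(theta) * z_t[1],
    which can be solved for the unobserved coordinate when sin(theta) != 0.
    """
    c, s = np.cos(theta), np.sin(theta)
    z_second = (x_prev - c * x_curr) / s
    return np.array([x_curr, z_second])

theta = 0.3
zs = simulate([1.0, 2.0], theta, steps=5)
xs = observe(zs)

# A single observation is ambiguous: two different latents give the same x.
a = np.array([1.0, 2.0])
b = np.array([1.0, -5.0])
ambiguous = observe(a[None])[0] == observe(b[None])[0]

# With one step of history, the full latent state at t = 3 is recovered.
z_hat = recover_with_context(xs[2], xs[3], theta)
recovered = np.allclose(z_hat, zs[3])
```

In this toy setting the recovery is a closed-form linear solve because the transition is known. In the setting described above, the transition and mixing are nonlinear and unknown, so the corresponding mapping from context to latents would have to be learned.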