The premise of identifiable and causal representation learning is to improve the current representation learning paradigm in terms of generalizability or robustness. Despite recent progress in questions of identifiability, more theoretical results demonstrating concrete advantages of these methods for downstream tasks are needed. In this paper, we consider the task of intervention extrapolation: predicting how interventions affect an outcome, even when those interventions are not observed at training time, and show that identifiable representations can provide an effective solution to this task even if the interventions affect the outcome non-linearly. Our setup includes an outcome Y, observed features X, which are generated as a non-linear transformation of latent features Z, and exogenous action variables A, which influence Z. The objective of intervention extrapolation is to predict how interventions on A that lie outside the training support of A affect Y. Here, extrapolation becomes possible if the effect of A on Z is linear and the residual when regressing Z on A has full support. As Z is latent, we combine the task of intervention extrapolation with identifiable representation learning, which we call Rep4Ex: we aim to map the observed features X into a subspace that allows for non-linear extrapolation in A. We show that the hidden representation is identifiable up to an affine transformation in Z-space, which is sufficient for intervention extrapolation. The identifiability is characterized by a novel constraint describing the linearity assumption of A on Z. Based on this insight, we propose a method that enforces the linear invariance constraint and can be combined with any type of autoencoder. We validate our theoretical findings through synthetic experiments and show that our approach succeeds in predicting the effects of unseen interventions.
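The extrapolation argument above can be illustrated with a small numerical sketch. This is not the Rep4Ex method itself: it assumes an oracle setting in which the latent Z is observed directly, so no representation learning or autoencoder is involved. The data-generating choices are also assumptions made purely for illustration: scalar variables, Z = A + V with standard Gaussian V, Y = Z² plus independent noise, and a cubic polynomial fit as a simple stand-in for a flexible regressor. All function names (`simulate`, `extrapolate_via_z`, `naive_linear`) are hypothetical.

```python
import numpy as np


def simulate(n, rng, a_low=-1.0, a_high=1.0):
    """Draw training data; A is only observed on [a_low, a_high]."""
    a = rng.uniform(a_low, a_high, size=n)
    v = rng.normal(0.0, 1.0, size=n)        # full-support residual (assumed Gaussian)
    z = 1.0 * a + v                         # linear effect of A on Z (assumed slope 1)
    y = z**2 + 0.1 * rng.normal(size=n)     # non-linear effect of Z on Y (assumed)
    return a, z, y


def extrapolate_via_z(a, z, y, a_star, degree=3):
    """Estimate E[Y | do(A = a_star)] by routing the intervention through Z."""
    # Step 1: linear regression of Z on A, keeping the residuals.
    slope, intercept = np.polyfit(a, z, 1)
    resid = z - (slope * a + intercept)
    # Step 2: regress Y on Z (polynomial stand-in for a flexible regressor).
    coefs = np.polyfit(z, y, degree)
    # Step 3: shift the residuals to the new action value and average the fit.
    z_star = slope * a_star + intercept + resid
    return float(np.mean(np.polyval(coefs, z_star)))


def naive_linear(a, y, a_star):
    """Baseline: linear regression of Y directly on A, extended beyond the support."""
    slope, intercept = np.polyfit(a, y, 1)
    return float(slope * a_star + intercept)


rng = np.random.default_rng(0)
a, z, y = simulate(20_000, rng)

a_star = 2.0                      # outside the training support [-1, 1]
truth = a_star**2 + 1.0           # E[(a_star + V)^2] with Var(V) = 1
est = extrapolate_via_z(a, z, y, a_star)
naive = naive_linear(a, y, a_star)

print(f"truth={truth:.2f}  via Z={est:.2f}  naive linear in A={naive:.2f}")
```

In this sketch the effect of A on Y is quadratic, so a model fit directly on A within [-1, 1] carries little information about a = 2. Routing the intervention through Z works because shifting the residuals by the fitted linear map yields values of Z that are largely covered by the training distribution of Z, which is the role the full-support assumption plays in the abstract.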