The premise of identifiable and causal representation learning is to improve the current representation learning paradigm in terms of generalizability or robustness. Despite recent progress in questions of identifiability, more theoretical results demonstrating concrete advantages of these methods for downstream tasks are needed. In this paper, we consider the task of intervention extrapolation: predicting how interventions affect an outcome, even when those interventions are not observed at training time, and show that identifiable representations can provide an effective solution to this task even if the interventions affect the outcome non-linearly. Our setup includes an outcome Y, observed features X, which are generated as a non-linear transformation of latent features Z, and exogenous action variables A, which influence Z. The objective of intervention extrapolation is to predict how interventions on A that lie outside the training support of A affect Y. Here, extrapolation becomes possible if the effect of A on Z is linear and the residual when regressing Z on A has full support. As Z is latent, we combine the task of intervention extrapolation with identifiable representation learning, which we call Rep4Ex: we aim to map the observed features X into a subspace that allows for non-linear extrapolation in A. We show using Wiener's Tauberian theorem that the hidden representation is identifiable up to an affine transformation in Z-space, which is sufficient for intervention extrapolation. The identifiability is characterized by a novel constraint describing the linearity assumption of A on Z. Based on this insight, we propose a method that enforces the linear invariance constraint and can be combined with any type of autoencoder. We validate our theoretical findings through synthetic experiments and show that our approach succeeds in predicting the effects of unseen interventions.
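To make the extrapolation argument concrete, the following is a minimal simulation sketch, not the Rep4Ex method itself. It assumes the latent Z is directly observed (standing in for a representation already identified up to an affine transformation), that A and Z are scalar, and that the noise on Y is independent of the residual of Z given A. The function `ell`, the helper `fit_binned`, and all constants are hypothetical choices made for illustration. The sketch regresses Z on A, fits E[Y | Z] non-parametrically on the training data, and then averages that fit over the shifted residuals to predict the effect of an intervention on A outside its training support.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative ground truth; every constant and function here is an assumption.
M_TRUE, SIGMA, A_STAR = 1.0, 2.0, 3.0   # A_STAR lies outside the training support


def ell(z):
    """Hypothetical non-linear effect of the latent Z on the outcome Y."""
    return np.sin(z) + 0.1 * z**2


n = 20_000
A = rng.uniform(-1.0, 1.0, size=n)          # training support of A is [-1, 1]
V = rng.normal(0.0, SIGMA, size=n)          # Gaussian residual, so full support
Z = M_TRUE * A + V                          # linear effect of A on Z
Y = ell(Z) + rng.normal(0.0, 0.1, size=n)   # outcome noise independent of V

# Step 1: linear regression of Z on A; keep the residuals.
slope, intercept = np.polyfit(A, Z, 1)
V_hat = Z - (slope * A + intercept)


def fit_binned(z, y, width=0.25):
    """Crude non-parametric regression: average y within bins of z."""
    idx = np.floor(z / width).astype(int)
    idx -= idx.min()                         # bincount needs non-negative indices
    counts = np.bincount(idx)
    keep = counts > 0                        # drop empty bins
    centers = np.bincount(idx, weights=z)[keep] / counts[keep]
    means = np.bincount(idx, weights=y)[keep] / counts[keep]
    return centers, means


# Step 2: estimate E[Y | Z] on the training data.
centers, means = fit_binned(Z, Y)

# Step 3: shift the residuals to the unseen intervention and average the fit.
# Queries beyond the observed range of Z are clamped by np.interp.
z_query = slope * A_STAR + intercept + V_hat
extrapolated = np.interp(z_query, centers, means).mean()

# Baseline: a linear fit of Y on A, extrapolated to A_STAR.
naive = np.polyval(np.polyfit(A, Y, 1), A_STAR)

# Closed-form target for this toy model: E[ell(M * a + V)] with V ~ N(0, SIGMA^2).
c = M_TRUE * A_STAR
truth = np.sin(c) * np.exp(-SIGMA**2 / 2) + 0.1 * (c**2 + SIGMA**2)

print(f"truth={truth:.3f}  extrapolated={extrapolated:.3f}  naive={naive:.3f}")
```

The sketch relies on the two conditions stated above. Because the effect of A on Z is linear, the shift induced by an unseen value of A can be computed from the fitted slope. Because the residual has full support, the shifted values of Z still fall (up to finite-sample tails) in a region where E[Y | Z] was estimated from training data. The binned regression is a deliberately simple stand-in; any non-parametric regressor could be substituted.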