Conventional supervised learning methods typically assume i.i.d. samples and are found to be sensitive to out-of-distribution (OOD) data. We propose Generative Causal Representation Learning (GCRL), which leverages causality to facilitate knowledge transfer under distribution shifts. Although we evaluate the proposed method on human trajectory prediction, GCRL can be applied to other domains as well. First, we propose a novel causal model that explains the generative factors in motion forecasting datasets using features that are common across all environments and features that are specific to each environment. Selection variables determine which parts of the model can be transferred directly to a new environment without fine-tuning. Second, we propose an end-to-end variational learning paradigm to learn the causal mechanisms that generate observations from features. GCRL is supported by strong theoretical results that imply identifiability of the causal model under certain assumptions. Experiments on synthetic and real-world motion forecasting datasets show that the proposed method is robust and effective for knowledge transfer in zero-shot and low-shot settings, substantially outperforming prior motion forecasting models on out-of-distribution prediction. Our code is available at https://github.com/sshirahmad/GCRL.