A major barrier to deploying current machine learning models is their lack of robustness to dataset shift. To address this problem, most existing studies attempt to transfer stable information to unseen environments. In particular, methods based on independent causal mechanisms propose removing mutable causal mechanisms via the do-operator. Compared with previous methods, the resulting stable predictors are more effective at identifying stable information. However, a key question remains: which subset of the whole stable information should the model transfer in order to achieve optimal generalization? To answer this question, we present a comprehensive minimax analysis from a causal perspective. Specifically, we first provide a graphical condition under which the whole stable set is optimal. When this condition fails, we show with an example that, surprisingly, the whole stable set is not the optimal one to transfer, even though it fully exploits the stable information. To identify the optimal subset in this case, we propose to estimate the worst-case risk with a novel optimization scheme over the intervention functions on the mutable causal mechanisms. We then propose an efficient algorithm that searches for the subset with minimal worst-case risk, based on a newly defined equivalence relation between stable subsets. Whereas exhaustive search over all subsets has exponential cost, our search strategy has polynomial complexity. We demonstrate the effectiveness and efficiency of our methods on synthetic data and on the diagnosis of Alzheimer's disease.
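To make the subset-selection question concrete, the following is a minimal sketch of the exhaustive baseline the abstract contrasts against: score every subset of the stable features by its worst-case risk and keep the minimizer. It is not the proposed polynomial-time algorithm, and it takes the maximum over a finite list of environments rather than optimizing over intervention functions. The names `worst_case_risk`, `exhaustive_minimax_subset`, and `toy_risk`, as well as the risk values, are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def worst_case_risk(subset, environments, risk):
    """Largest risk of a predictor built on `subset` across the given environments."""
    return max(risk(subset, env) for env in environments)

def exhaustive_minimax_subset(stable_features, environments, risk):
    """Brute-force search over all 2^n subsets of the stable features.

    Returns the subset with the smallest worst-case risk and that risk.
    The cost is exponential in the number of features.
    """
    best_subset, best_value = None, float("inf")
    for k in range(len(stable_features) + 1):
        for combo in combinations(stable_features, k):
            subset = frozenset(combo)
            value = worst_case_risk(subset, environments, risk)
            if value < best_value:
                best_subset, best_value = subset, value
    return best_subset, best_value

# Made-up risk table (illustrative numbers, not results from the paper):
# keys are (subset, environment), values are the risk in that environment.
toy_risk = {
    (frozenset(), 0): 1.0, (frozenset(), 1): 1.0,
    (frozenset({"a"}), 0): 0.4, (frozenset({"a"}), 1): 0.5,
    (frozenset({"b"}), 0): 0.6, (frozenset({"b"}), 1): 0.7,
    (frozenset({"a", "b"}), 0): 0.2, (frozenset({"a", "b"}), 1): 0.9,
}

best_subset, best_value = exhaustive_minimax_subset(
    ["a", "b"], [0, 1], lambda s, e: toy_risk[(s, e)]
)
```

In this toy table the full set {a, b} has the lowest risk in environment 0 but the highest in environment 1, so the minimax choice is the smaller subset {a}. This mirrors, under assumed numbers, the abstract's observation that the whole stable set need not be the optimal one to transfer.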