Predictive models in clinical settings often lose performance when deployed in new hospitals because of distribution shifts. This paper presents a large-scale study of causality-inspired domain generalization on heterogeneous multi-center intensive care unit (ICU) data. We apply anchor regression and introduce anchor boosting, a novel tree-based nonlinear extension, to a dataset of 400,000 patients from nine distinct ICU databases. We find that anchor regularization improves out-of-distribution performance, particularly for the most dissimilar target domains. The methods appear robust to violations of theoretical assumptions, such as anchor exogeneity. Furthermore, we propose a novel conceptual framework to quantify the utility of large external datasets. By evaluating performance as a function of available target-domain data, we identify three regimes: (i) a domain generalization regime, where only the external model should be used; (ii) a domain adaptation regime, where refitting the external model is optimal; and (iii) a data-rich regime, where external data provide no additional value.
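The abstract names anchor regression without stating the estimator. As a minimal sketch, assuming the standard linear formulation (minimize the squared residual orthogonal to the anchors plus gamma times the squared residual explained by the anchors), the fit reduces to ordinary least squares on transformed data. The function names, the use of a one-hot site indicator as the anchor, and the synthetic data below are illustrative assumptions, not the authors' implementation or data.

```python
import numpy as np


def project_onto_anchors(A, V):
    """Project V onto the column space of the anchor matrix A via least squares."""
    coef, *_ = np.linalg.lstsq(A, V, rcond=None)
    return A @ coef


def anchor_regression(X, y, A, gamma):
    """Linear anchor regression (illustrative sketch).

    Minimizes ||(I - P_A) r||^2 + gamma * ||P_A r||^2 with r = y - X b,
    where P_A projects onto the (centered) anchors. gamma = 1 recovers OLS;
    larger gamma penalizes residual variation explained by the anchors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.asarray(A, dtype=float)

    # Center everything so the intercept can be recovered afterwards.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc, Ac = X - x_mean, y - y_mean, A - A.mean(axis=0)

    # Apply W = I - (1 - sqrt(gamma)) P_A to X and y, then run plain OLS.
    scale = 1.0 - np.sqrt(gamma)
    X_t = Xc - scale * project_onto_anchors(Ac, Xc)
    y_t = yc - scale * project_onto_anchors(Ac, yc)

    coef, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


# Synthetic example (hypothetical data): three sites with shifted covariates.
rng = np.random.default_rng(0)
n = 600
site = rng.integers(0, 3, size=n)
A = np.eye(3)[site]                       # one-hot site indicator as anchor
shift = np.array([-1.0, 0.0, 2.0])[site]  # site-specific shift
X = rng.normal(size=(n, 2)) + shift[:, None]
y = X @ np.array([1.0, -0.5]) + 0.5 * shift + rng.normal(size=n)

coef_ols, _ = anchor_regression(X, y, A, gamma=1.0)      # equivalent to OLS
coef_anchor, _ = anchor_regression(X, y, A, gamma=10.0)  # anchor-regularized
```

In this sketch, gamma is the only tuning parameter; how it is selected, and how the nonlinear anchor boosting variant is built, are not described in the abstract and are left out here.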