Clinical machine learning models are increasingly trained under large-scale, multimodal foundation paradigms, yet deployment environments often differ systematically from the data-generating settings used during training. Such shifts arise from heterogeneous measurement policies, documentation practices, and institutional workflows, entangling physiologic signal with practice-specific artifacts in learned representations. In this work, we propose a practice-invariant representation learning framework for multimodal clinical prediction. We model clinical observations as arising from latent physiologic factors and environment-dependent processes, and introduce an objective that jointly optimizes predictive performance while suppressing environment-predictive information in the learned embedding. Concretely, we combine supervised risk minimization with adversarial environment regularization and invariant-risk penalties across hospitals. Across multiple longitudinal EHR prediction tasks and cross-institution evaluations, our method improves out-of-distribution AUROC by up to 2–3 points relative to masked-pretraining and standard supervised baselines, while maintaining in-distribution performance and improving calibration. These results demonstrate that explicitly accounting for systematic distribution shift during representation learning yields more robust, transferable clinical models, and they highlight the importance of structural invariance alongside architectural scale in healthcare AI.
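To make the shape of the objective concrete, the following is a minimal NumPy sketch of the three terms described above: a supervised risk, an adversarial environment-classifier term (the encoder is trained to *increase* this loss, which is what a gradient-reversal layer implements), and a per-environment invariant-risk penalty in the IRMv1 style, using the closed-form gradient of the logistic risk with respect to a dummy scalar classifier at w = 1. All function names, the ±1 label encoding, and the two-environment setup are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(z, y):
    # y in {-1, +1}; numerically stable form of log(1 + exp(-y*z))
    return np.log1p(np.exp(-np.abs(y * z))) + np.maximum(-y * z, 0.0)

def irm_penalty(z, y):
    # IRMv1-style penalty: squared gradient of the risk w.r.t. a dummy
    # scalar classifier w, evaluated at w = 1. For logistic loss this
    # gradient is E[-y * sigmoid(-y*z) * z] in closed form.
    g = np.mean(-y * sigmoid(-y * z) * z)
    return g ** 2

def practice_invariant_loss(z_task, y, z_env, e, lam_adv=0.1, lam_irm=1.0):
    """Hypothetical combined objective (a sketch, not the authors' code):
    supervised risk
      - lam_adv * environment-classifier loss   (adversarial term: the env
                                                 head minimizes it, the
                                                 encoder maximizes it)
      + lam_irm * sum of per-environment IRM penalties.
    z_task: task-head logits; y: outcome labels in {-1, +1};
    z_env: environment-head logits; e: environment labels in {-1, +1}."""
    risk = np.mean(logistic_loss(z_task, y))
    adv = np.mean(logistic_loss(z_env, e))
    penalty = sum(irm_penalty(z_task[e == v], y[e == v])
                  for v in np.unique(e))
    return risk - lam_adv * adv + lam_irm * penalty
```

In a full training loop the adversarial term would be realized with a gradient-reversal layer (or alternating min-max updates) between the encoder and the environment head; the sketch above only shows how the three loss components combine into one scalar objective.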