Closing the Domain Gap in Biomedical Imaging by In-Context Control Samples

Batch effects, systematic technical variations unrelated to the biological signal of interest, are the central problem in biomedical imaging. They critically undermine experimental reproducibility and are the primary cause of failure of deep learning systems on new experimental batches, preventing their practical use in the real world. Despite years of research, no method has succeeded in closing this performance gap for deep learning models. We propose Control-Stabilized Adaptive Risk Minimization via Batch Normalization (CS-ARM-BN), a meta-learning adaptation method that exploits negative control samples. These unperturbed reference images are present in every experimental batch by design and serve as stable context for adaptation. We validate our novel method on Mechanism-of-Action (MoA) classification, a crucial task for drug discovery, on the large-scale JUMP-CP dataset. The accuracy of standard ResNets drops from 0.939 $\pm$ 0.005 on the training domain to 0.862 $\pm$ 0.060 on data from new experimental batches. Foundation models, even after Typical Variation Normalization, fail to close this gap. We are the first to show that meta-learning approaches close the domain gap, achieving 0.935 $\pm$ 0.018. If the new experimental batches exhibit strong domain shifts, for example because they were generated in a different lab, meta-learning approaches can be stabilized with control samples, which are always available in biomedical experiments. Our work shows that batch effects in bioimaging data can be effectively neutralized through principled in-context adaptation, which also makes deep learning systems practically usable and efficient on such data.
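
To make the role of control samples concrete, the following is a minimal sketch of the general idea only, not the authors' CS-ARM-BN implementation. It assumes that a batch effect acts as a per-batch shift and scale on feature vectors, and that normalization statistics are estimated from the negative controls of each batch rather than from the training set. All function names, the simulated data, and the shift/scale model are illustrative assumptions.

```python
import numpy as np


def control_stats(controls, eps=1e-5):
    """Per-feature mean and std estimated from one batch's control samples."""
    mean = controls.mean(axis=0)
    std = np.sqrt(controls.var(axis=0) + eps)
    return mean, std


def normalize_with_controls(features, controls, eps=1e-5):
    """Normalize a batch's features with statistics from its own controls.

    Hypothetical stand-in for a normalization layer whose statistics come
    from in-context control samples instead of stored running averages.
    """
    mean, std = control_stats(controls, eps)
    return (features - mean) / std


rng = np.random.default_rng(0)
n_features = 8


def simulate_batch(shift, scale, n_controls=200, n_treated=50):
    """Simulate one experimental batch (assumed shift/scale batch effect)."""
    # Controls are unperturbed; treated samples carry a biological signal (+1).
    controls_clean = rng.normal(0.0, 1.0, size=(n_controls, n_features))
    treated_clean = rng.normal(1.0, 1.0, size=(n_treated, n_features))
    # The same technical distortion hits every sample in the batch.
    return scale * controls_clean + shift, scale * treated_clean + shift


controls_a, treated_a = simulate_batch(shift=0.0, scale=1.0)  # "training" batch
controls_b, treated_b = simulate_batch(shift=5.0, scale=3.0)  # "new lab" batch

# Gap between the two batches' treated samples before and after adaptation.
raw_gap = abs(treated_a.mean() - treated_b.mean())
adapted_gap = abs(
    normalize_with_controls(treated_a, controls_a).mean()
    - normalize_with_controls(treated_b, controls_b).mean()
)
print(f"raw gap: {raw_gap:.2f}, gap after control normalization: {adapted_gap:.2f}")
```

In this toy setting the controls absorb the batch-specific shift and scale, so treated samples from both batches land in a comparable feature space. The method described in the abstract additionally meta-learns the model so that it works with such context-dependent statistics, which this sketch does not attempt.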
