Domain generalization (DG) is the problem of generalizing from several distributions (or domains), for which labeled training data are available, to a new test domain for which no labeled data are available. For the prevailing benchmark datasets in DG, there exists a single classifier that performs well across all domains. In this work, we study a fundamentally different regime in which the domains satisfy a \emph{posterior drift} assumption, under which the optimal classifier may vary substantially across domains. We establish a decision-theoretic framework for DG under posterior drift, and investigate the practical implications of this framework through experiments on language and vision tasks.