Successful unsupervised domain adaptation (UDA) is guaranteed only under strong assumptions such as covariate shift and overlap between input domains. The latter is often violated in high-dimensional applications such as image classification, which nevertheless continues to serve as both inspiration and benchmark for algorithm development. In this work, we show that access to side information about examples from the source and target domains can help relax these assumptions and increase sample efficiency in learning, at the cost of collecting a richer variable set. We call this domain adaptation by learning using privileged information (DALUPI). For this task, we propose a simple two-stage learning algorithm inspired by our analysis, as well as a practical end-to-end algorithm for multi-label image classification. In a suite of experiments, including an application to medical image analysis, we demonstrate that incorporating privileged information in learning can reduce errors in domain transfer compared to classical learning.