Successful unsupervised domain adaptation (UDA) is guaranteed only under strong assumptions such as covariate shift and overlap between input domains. The latter is often violated in high-dimensional applications such as image classification, which, despite this challenge, continues to serve as inspiration and benchmark for algorithm development. In this work, we show that access to side information about examples from the source and target domains can help relax these assumptions and increase sample efficiency in learning, at the cost of collecting a richer variable set. We call this domain adaptation by learning using privileged information (DALUPI). Tailored for this task, we propose a simple two-stage learning algorithm inspired by our analysis and a practical end-to-end algorithm for multi-label image classification. In a suite of experiments, including an application to medical image analysis, we demonstrate that incorporating privileged information in learning can reduce errors in domain transfer compared to classical learning.