Unsupervised domain adaptation (UDA) approaches address the covariate shift problem by minimizing the distribution discrepancy between the source and target domains, assuming that the label distribution is invariant across domains. However, in the imbalanced domain adaptation (IDA) scenario, covariate shift and long-tailed label shift both exist across domains. To tackle the IDA problem, some current research focuses on minimizing the distribution discrepancy of each corresponding class between the source and target domains. Such methods rely heavily on selecting reliable pseudo-labels and estimating feature distributions for the target domain, and the limited number of samples in minority classes makes these estimates more uncertain, which degrades model performance. In this paper, we propose a cross-domain class discrepancy minimization method based on accumulative class-centroids for IDA (centroIDA). First, a class-based re-sampling strategy is used to obtain an unbiased classifier on the source domain. Second, an accumulative class-centroids alignment loss is proposed for iterative class-centroid alignment across domains. Finally, a class-wise feature alignment loss is used to optimize the feature representation for a robust classification boundary. Experiments show that our method outperforms other state-of-the-art (SOTA) methods on the IDA problem, especially as the degree of label shift increases.
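The first two steps named in the abstract, class-based re-sampling and accumulative class-centroids, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption: the centroids are taken to be running per-class feature means, the alignment loss is taken to be the mean squared Euclidean distance between matching centroids, and the names `class_balanced_sample`, `AccumulativeCentroids`, and `centroid_alignment_loss` are hypothetical. In practice the target-domain labels would be pseudo-labels predicted by the classifier.

```python
import random
from collections import defaultdict


def class_balanced_sample(labels, n_samples, seed=0):
    """Return sample indices drawn so that every class is equally likely."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    classes = sorted(by_class)
    # Pick a class uniformly, then a sample uniformly within that class.
    return [rng.choice(by_class[rng.choice(classes)]) for _ in range(n_samples)]


class AccumulativeCentroids:
    """Per-class feature means accumulated over all batches seen so far."""

    def __init__(self):
        self.sums = {}    # class -> running sum of feature vectors
        self.counts = {}  # class -> number of features accumulated

    def update(self, feats, labels):
        # Fold one mini-batch into the running per-class sums and counts.
        for f, y in zip(feats, labels):
            if y not in self.sums:
                self.sums[y] = [0.0] * len(f)
                self.counts[y] = 0
            self.sums[y] = [s + v for s, v in zip(self.sums[y], f)]
            self.counts[y] += 1

    def centroid(self, y):
        return [s / self.counts[y] for s in self.sums[y]]


def centroid_alignment_loss(src, tgt):
    """Mean squared Euclidean distance between centroids of shared classes
    (an assumed distance; the abstract does not specify one)."""
    shared = sorted(set(src.counts) & set(tgt.counts))
    if not shared:
        return 0.0
    total = 0.0
    for y in shared:
        total += sum((a - b) ** 2
                     for a, b in zip(src.centroid(y), tgt.centroid(y)))
    return total / len(shared)
```

Accumulating sums over many batches, rather than recomputing centroids from a single batch, is one plausible way to stabilize estimates for minority classes that contribute only a few samples per batch, which is the uncertainty the abstract highlights.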