Source-free domain adaptation (SFDA), in which only a pre-trained source model is available for adapting to the target distribution, is a more general approach to domain adaptation in real-world settings. However, without supervised information on the target domain, it is difficult to capture the inherent structure of the target features accurately. By analyzing the clustering performance of the target features, we show that they still contain core features related to discriminative attributes, but that their semantic information is not well organized. Inspired by this insight, we present Chaos to Order (CtO), a novel approach for SFDA that constrains semantic credibility and propagates label information among target subpopulations. CtO divides the target data into inner and outlier samples using an adaptive threshold based on the learning state, and applies to each group the learning strategy that best fits its properties. Specifically, inner samples are used to learn intra-class structure because they are relatively well clustered, while low-density outlier samples are regularized by input consistency to achieve high accuracy with respect to the ground-truth labels. By applying different learning strategies to propagate labels from inner local regions to outlier instances, CtO clusters the global samples from chaos to order. We further adaptively regulate the neighborhood affinity of the inner samples to constrain local semantic credibility. Theoretical and empirical analyses demonstrate that our algorithm not only propagates labels from inner to outlier samples but also prevents local clustering from forming spurious clusters. Experiments show that CtO outperforms state-of-the-art methods on three public benchmarks: Office-31, Office-Home, and VisDA.
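The inner/outlier division described above can be illustrated with a minimal sketch. The abstract does not specify how the adaptive threshold or the consistency regularizer is computed, so everything here is an assumption for illustration only: the threshold is taken to be the mean prediction confidence of the current batch, the consistency term is a cross-entropy between predictions on two views of the same input, and the names `split_inner_outlier` and `consistency_loss` are hypothetical. This is not the authors' implementation.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def split_inner_outlier(logits):
    """Split samples into inner and outlier groups by prediction confidence.

    ASSUMPTION: the adaptive threshold is the mean confidence of the batch,
    used here as a stand-in for a threshold that tracks the learning state.
    """
    probs = softmax(logits)
    confidence = probs.max(axis=1)        # confidence of the predicted class
    threshold = confidence.mean()         # changes as the model's predictions change
    inner_mask = confidence >= threshold  # relatively well-clustered samples
    outlier_mask = ~inner_mask            # low-confidence samples
    return inner_mask, outlier_mask, threshold


def consistency_loss(probs_view_a, probs_view_b, eps=1e-12):
    """Cross-entropy between predictions on two views of the same inputs.

    ASSUMPTION: a generic input-consistency regularizer for outlier samples.
    """
    ce = -(probs_view_a * np.log(probs_view_b + eps)).sum(axis=1)
    return float(ce.mean())


if __name__ == "__main__":
    # Toy logits for four target samples and three classes.
    logits = np.array([
        [5.0, 0.0, 0.0],
        [0.1, 0.0, 0.0],
        [0.0, 4.0, 0.0],
        [0.0, 0.0, 0.2],
    ])
    inner, outlier, thr = split_inner_outlier(logits)
    print("inner:", inner, "outlier:", outlier, "threshold:", round(float(thr), 3))
```

In this sketch, samples whose confidence is at or above the batch mean are treated as inner samples and the rest as outliers. Any other learning-state statistic could replace the mean without changing the overall structure.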