Unsupervised Domain Adaptation (DA) adapts a model trained on a labeled source domain so that it performs well on an unlabeled target domain whose data distribution has shifted. While many methods have been proposed in the literature, fair and realistic evaluation remains an open question, largely because selecting hyperparameters is methodologically difficult when no target labels are available. With SKADA-Bench, we propose a framework for evaluating DA methods and present a fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment. Hyperparameters are selected realistically, using nested cross-validation and a range of unsupervised model selection scores, on both simulated datasets with controlled shifts and real-world datasets across diverse modalities, such as images, text, biomedical, and tabular data with specific feature extraction. Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications, with key insights into the choice and impact of model selection approaches. SKADA-Bench is open-source and reproducible, and it can be easily extended with novel DA methods, datasets, and model selection criteria without re-evaluating competing methods. SKADA-Bench is available on GitHub at https://github.com/scikit-adaptation/skada-bench.
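To make the idea of an unsupervised model selection score concrete, here is a minimal sketch that ranks hyperparameter candidates by the mean entropy of their predictions on unlabeled target data. Prediction entropy is used here only as one illustrative score; it is an assumption of this example, not a statement of which scores SKADA-Bench implements or recommends. The function names and the toy probabilities are hypothetical.

```python
import math


def mean_prediction_entropy(probas):
    """Average Shannon entropy (in nats) over rows of class probabilities."""
    total = 0.0
    for row in probas:
        # Zero probabilities contribute nothing to the entropy.
        total += -sum(p * math.log(p) for p in row if p > 0.0)
    return total / len(probas)


def select_by_entropy(candidates):
    """Pick the hyperparameter whose target predictions have the lowest mean entropy.

    candidates maps a hyperparameter value to the predicted class
    probabilities on unlabeled target samples. No target labels are used.
    """
    scores = {hp: mean_prediction_entropy(p) for hp, p in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical target-domain predictions for two values of one hyperparameter.
candidates = {
    0.1: [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]],
    10.0: [[0.55, 0.45], [0.5, 0.5], [0.6, 0.4]],
}
best_hp, scores = select_by_entropy(candidates)
```

A score like this needs no target labels, which is what makes it usable in the unsupervised setting, but it can also reward confidently wrong models. That weakness is one reason a benchmark comparing several selection scores is informative.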