Multi-domain learning (MDL) aims to train a model with minimal average risk across multiple overlapping but non-identical domains. To tackle the challenges of dataset bias and domain domination, numerous MDL approaches have been proposed, either seeking commonalities by aligning distributions to reduce the domain gap or preserving differences by implementing domain-specific towers, gates, and even experts. As a result, MDL models are becoming increasingly complex, with sophisticated network architectures or loss functions that introduce extra parameters and increase computation costs. In this paper, we propose a frustratingly easy and hyperparameter-free multi-domain learning method named Decoupled Training (D-Train). D-Train is a tri-phase general-to-specific training strategy that first pre-trains on all domains to warm up a root model, then post-trains on each domain by splitting into multiple heads, and finally fine-tunes the heads with the backbone fixed, enabling decoupled training to achieve domain independence. Despite its extraordinary simplicity and efficiency, D-Train performs remarkably well in extensive evaluations on various datasets, from standard benchmarks to applications in satellite imagery and recommender systems.
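The three phases can be illustrated with a minimal sketch in plain Python. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy one-dimensional regression task, the scalar "backbone" weight `w`, the affine heads `(a, b)`, the hypothetical helper names `sgd_step` and `d_train`, the learning rate and epoch count, and the choice to interleave domains during post-training. Only the overall pre-train, post-train, fine-tune order follows the description above.

```python
import copy
import random

def predict(backbone, head, x):
    """Shared backbone feature h = w * x, then a head y = a * h + b."""
    return head["a"] * (backbone["w"] * x) + head["b"]

def sgd_step(backbone, head, x, y, lr, train_backbone=True):
    """One SGD step on the squared error 0.5 * (prediction - y)^2."""
    h = backbone["w"] * x
    err = head["a"] * h + head["b"] - y
    grad_w = err * head["a"] * x
    grad_a = err * h
    grad_b = err
    if train_backbone:
        backbone["w"] -= lr * grad_w
    head["a"] -= lr * grad_a
    head["b"] -= lr * grad_b

def mse(backbone, head, data):
    return sum((predict(backbone, head, x) - y) ** 2 for x, y in data) / len(data)

def d_train(domains, epochs=200, lr=0.05, seed=0):
    """Toy tri-phase loop; epochs and lr are illustrative settings only."""
    rng = random.Random(seed)
    backbone = {"w": 1.0}
    root_head = {"a": 0.5, "b": 0.0}
    pooled = [(d, x, y) for d, data in domains.items() for x, y in data]

    # Phase 1 (pre-train): one shared head, all domains pooled -> root model.
    for _ in range(epochs):
        rng.shuffle(pooled)
        for _, x, y in pooled:
            sgd_step(backbone, root_head, x, y, lr)
    root = (copy.deepcopy(backbone), copy.deepcopy(root_head))

    # Phase 2 (post-train): split the root head into one head per domain;
    # each sample updates the shared backbone and its own domain's head.
    heads = {d: copy.deepcopy(root_head) for d in domains}
    for _ in range(epochs):
        rng.shuffle(pooled)
        for d, x, y in pooled:
            sgd_step(backbone, heads[d], x, y, lr)
    backbone_after_phase2 = copy.deepcopy(backbone)

    # Phase 3 (fine-tune): freeze the backbone, train each head on its domain.
    for d, data in domains.items():
        for _ in range(epochs):
            for x, y in data:
                sgd_step(backbone, heads[d], x, y, lr, train_backbone=False)

    return {
        "root": root,
        "backbone_after_phase2": backbone_after_phase2,
        "backbone": backbone,
        "heads": heads,
    }

# Two toy domains that share a slope but differ in offset.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
domains = {
    "A": [(x, 2.0 * x + 1.0) for x in xs],
    "B": [(x, 2.0 * x - 1.0) for x in xs],
}
result = d_train(domains)
root_backbone, root_head = result["root"]
for name, data in domains.items():
    print(name,
          "root MSE:", mse(root_backbone, root_head, data),
          "final MSE:", mse(result["backbone"], result["heads"][name], data))
```

In this toy setting the single-head root model cannot fit both offsets at once, while the per-domain heads can. Because the backbone is frozen in the last phase, updating one domain's head cannot affect another domain's predictions.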