We study best-effort adaptation, a problem motivated by several applications and considerations: determining an accurate predictor for a target domain, for which a moderate number of labeled samples is available, while leveraging information from another domain for which substantially more labeled samples are at one's disposal. We present a new and general discrepancy-based theoretical analysis of sample reweighting methods, including bounds that hold uniformly over the weights. We show how these bounds can guide the design of learning algorithms, which we discuss in detail. We further show that our learning guarantees and algorithms provide improved solutions for standard domain adaptation problems, in which little or no labeled data is available from the target domain. Finally, we report the results of a series of experiments demonstrating the effectiveness of our best-effort adaptation and domain adaptation algorithms, along with comparisons against several baselines. We also discuss how our analysis can benefit the design of principled solutions for fine-tuning.