Theoretical studies of transfer learning or domain adaptation have so far focused on settings with a known hypothesis class or model. In practice, however, some amount of model selection is usually involved, often under the umbrella term of hyperparameter tuning: for example, one may think of the problem of tuning for the right neural network architecture for a target task while leveraging data from a related source task. In addition to the usual tradeoff between approximation and estimation errors involved in model selection, this problem brings in a new complexity term, namely the transfer distance between source and target distributions, which is known to vary with the choice of hypothesis class. We present a first study of this problem, focusing on classification. In particular, the analysis reveals some remarkable phenomena: adaptive rates, i.e., those achievable with no distributional information, can be arbitrarily slower than oracle rates, i.e., those achievable given knowledge of the distances.