Theoretical studies of transfer learning or domain adaptation have so far focused on settings with a known hypothesis class or model. In practice, however, some amount of model selection is usually involved, often under the umbrella term of hyperparameter tuning: for example, one may think of the problem of tuning for the right neural network architecture for a target task while leveraging data from a related source task. In addition to the usual tradeoff between approximation and estimation errors involved in model selection, this problem brings in a new complexity term, namely the transfer distance between source and target distributions, which is known to vary with the choice of hypothesis class. We present a first study of this problem, focusing on classification. In particular, the analysis reveals some remarkable phenomena: adaptive rates, i.e., those achievable with no distributional information, can be arbitrarily slower than oracle rates, i.e., those achievable given knowledge of the distances.