Multitask learning and related frameworks have achieved tremendous success in modern applications. In the multitask learning problem, we are given a collection of heterogeneous datasets from related source tasks and hope to achieve better performance than what could be obtained by solving each task individually. The recent work arXiv:2006.15785 showed that, without access to distributional information, no algorithm based on aggregating samples alone can guarantee optimal risk as long as the sample size per task is bounded. In this paper, we focus on understanding the statistical limits of multitask learning. We go beyond the no-free-lunch theorem of arXiv:2006.15785 by establishing a stronger impossibility result for adaptation that holds for arbitrarily large sample sizes per task. This improvement conveys an important message: the hardness of multitask learning cannot be overcome by having abundant data per task. We also discuss the notion of optimal adaptivity, which may be of future interest.