Contemporary artificial intelligence systems exhibit rapidly growing abilities, accompanied by growth in required resources, expansive datasets, and corresponding investments in computing infrastructure. Although earlier successes predominantly focused on constrained settings, recent strides in fundamental research and applications aspire to create increasingly general systems. This evolving landscape presents both opportunities and challenges in refining the generalisation and transfer of knowledge: its extraction from existing sources and its adaptation as a comprehensive foundation for tackling new problems. Within the domain of reinforcement learning (RL), knowledge is represented through various modalities, including dynamics and reward models, value functions, policies, and the original data. This taxonomy systematically targets these modalities and frames its discussion based on their inherent properties and alignment with different objectives and mechanisms for transfer. Where possible, we aim to provide coarse guidance delineating approaches that address requirements such as limiting environment interactions, maximising computational efficiency, and enhancing generalisation across varying axes of change. Finally, we analyse reasons contributing to the prevalence or scarcity of specific forms of transfer and the inherent potential behind pushing these frontiers, and we underscore the significance of transitioning from designed to learned transfer.