Decision-making is a dynamic process that requires perception, memory, and reasoning to make choices and find optimal policies. Traditional approaches to decision-making suffer from poor sample efficiency and limited generalization, whereas large-scale self-supervised pretraining has enabled fast adaptation through fine-tuning or few-shot learning in language and vision. We therefore argue for integrating knowledge acquired from generic large-scale self-supervised pretraining into downstream decision-making problems. We propose a Pretrain-Then-Adapt pipeline and survey recent work on data collection, pretraining objectives, and adaptation strategies for decision-making pretraining and downstream inference. Finally, we identify critical challenges and future directions for developing decision foundation models with the help of generic and flexible self-supervised pretraining.