Post-training of foundation language models has emerged as a promising research direction in federated learning (FL), with the goal of enabling privacy-preserving model improvement and adaptation to users' downstream tasks. Recent work in this area adopts centralized post-training approaches built on black-box foundation language models, for which neither the model weights nor the architecture details are accessible. Although black-box models have been used successfully in centralized post-training, replicating that practice blindly in FL raises several concerns. We argue that using black-box models in FL contradicts core principles of federation, such as data privacy and autonomy. In this paper, we critically analyze the use of black-box models in federated post-training and provide a detailed account of the various aspects of openness and their implications for FL.