Mixed-precision computing, a widely applied technique in AI, offers a larger trade-off space between accuracy and efficiency. The recently proposed Mixed-Precision Over-the-Air Federated Learning (MP-OTA-FL) lets each client operate at a precision level suited to its heterogeneous hardware, taking advantage of this larger trade-off space while covering the quantization overhead within the mixed-precision modulation scheme used for OTA aggregation. A key to further exploiting the potential of the MP-OTA-FL framework is optimizing the clients' precision levels. The choice of precision level hinges on multiple factors, including hardware capability, potential client contribution, and user satisfaction, some of which are difficult to define or quantify. In this paper, we propose a RAG-based user-profiling framework for precision planning that integrates retrieval-augmented LLMs with dynamic client profiling to optimize user satisfaction and client contributions. The framework includes a hybrid interface for gathering device and user insights and a RAG database that stores historical quantization decisions together with their feedback. Experiments show that our method improves user satisfaction, energy savings, and global model accuracy in MP-OTA-FL systems.
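The retrieval step behind such a database of historical quantization decisions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the framework's actual design: the `Record` fields, the numeric client profile vector, the cosine similarity measure, and the `plan_precision` helper are all hypothetical. In particular, the LLM that would consume the retrieved records is replaced here by a simple similarity-weighted vote over past feedback.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class Record:
    """One past decision: hypothetical profile features, chosen bit width, feedback."""
    profile: Tuple[float, ...]  # assumed numeric encoding of device/user insights
    bits: int                   # precision level that was assigned
    feedback: float             # assumed satisfaction score in [0, 1]


def cosine(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Cosine similarity between two profile vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(store: List[Record], query: Tuple[float, ...], k: int = 2) -> List[Record]:
    """Return the k stored records whose profiles are most similar to the query."""
    return sorted(store, key=lambda r: cosine(r.profile, query), reverse=True)[:k]


def plan_precision(store: List[Record], query: Tuple[float, ...],
                   k: int = 2, default_bits: int = 8) -> int:
    """Pick the bit width with the highest similarity-weighted feedback.

    Stand-in for the LLM step: a real system would place the retrieved
    records in a prompt instead of voting.
    """
    neighbours = retrieve(store, query, k)
    if not neighbours:
        return default_bits
    scores = {}
    for r in neighbours:
        weight = cosine(r.profile, query) * r.feedback
        scores[r.bits] = scores.get(r.bits, 0.0) + weight
    return max(scores, key=scores.get)


store = [
    Record((1.0, 0.0), 16, 0.9),
    Record((0.9, 0.1), 16, 0.8),
    Record((0.0, 1.0), 4, 0.95),
]
chosen = plan_precision(store, (1.0, 0.05))
```

With this toy store, a client whose profile resembles the first two records inherits their 16-bit setting, while an empty store falls back to the assumed default of 8 bits.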