The rapid advancement of Large Language Models (LLMs) has driven their integration into mobile devices for personalized assistance, where calling external API functions allows LLMs to extend their capabilities. However, challenges such as data scarcity, ineffective question formatting, and catastrophic forgetting hinder the development of on-device LLM agents. To address these issues, we propose Alopex, a framework that enables precise on-device function calls using the Fox LLM. Alopex introduces a logic-based method for generating high-quality training data and a novel ``description-question-output'' fine-tuning format that reduces the risk of function information leakage. Additionally, a data mixing strategy that combines function call data with textbook datasets mitigates catastrophic forgetting and improves performance across diverse tasks. Experimental results show that Alopex improves function call accuracy and significantly reduces catastrophic forgetting, providing a robust solution for integrating function call capabilities into LLMs without manual intervention.