In domain-specific applications, GPT-4 augmented with precise prompts or Retrieval-Augmented Generation (RAG) shows notable potential but faces a critical trilemma of performance, cost, and data privacy. High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow is often costly and difficult. To address this, we introduce PEER (Plan, Execute, Express, Review), a multi-agent framework that systematizes domain-specific tasks by integrating precise question decomposition, advanced information retrieval, comprehensive summarization, and rigorous self-assessment. Because of cost and data-privacy concerns, enterprises are shifting from proprietary models such as GPT-4 to custom models that balance cost, security, and performance. We developed industrial practices that leverage online data and user feedback for efficient model tuning. This study provides best-practice guidelines for applying multi-agent systems to domain-specific problem solving and for implementing effective agent tuning strategies. Our empirical studies, particularly in financial question answering, show that our approach achieves 95.0% of GPT-4's performance while managing costs effectively and ensuring data privacy.
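The four PEER stages named above can be illustrated with a minimal control-flow sketch. Everything in it is an assumption made for illustration, not the framework's actual implementation: the function `run_peer`, the `Review` and `PeerResult` types, the score threshold, the retry loop that feeds reviewer feedback back into planning, and the toy agents, which stand in for model calls and retrieval.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    score: float    # hypothetical quality score in [0, 1]
    feedback: str   # hypothetical hint passed back to the planner


@dataclass
class PeerResult:
    answer: str
    rounds: int
    score: float


def run_peer(
    question: str,
    plan: Callable[[str, str], List[str]],
    execute: Callable[[str], str],
    express: Callable[[str, List[str]], str],
    review: Callable[[str, str], Review],
    threshold: float = 0.8,   # assumed acceptance threshold
    max_rounds: int = 3,      # assumed retry budget
) -> PeerResult:
    """Run Plan -> Execute -> Express -> Review until the review passes."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback = ""
    answer, score, rounds = "", 0.0, 0
    for rounds in range(1, max_rounds + 1):
        sub_questions = plan(question, feedback)          # Plan: decompose
        evidence = [execute(sq) for sq in sub_questions]  # Execute: retrieve
        answer = express(question, evidence)              # Express: summarize
        verdict = review(question, answer)                # Review: self-assess
        score = verdict.score
        if score >= threshold:
            break
        feedback = verdict.feedback
    return PeerResult(answer=answer, rounds=rounds, score=score)


# Toy agents standing in for model calls; all behavior here is made up.
def toy_plan(question: str, feedback: str) -> List[str]:
    subs = [f"Background of: {question}", f"Key figures for: {question}"]
    if feedback:
        subs.append(f"Follow-up on review feedback: {feedback}")
    return subs


def toy_execute(sub_question: str) -> str:
    return f"[evidence for '{sub_question}']"


def toy_express(question: str, evidence: List[str]) -> str:
    return f"Answer to '{question}' based on {len(evidence)} findings."


def toy_review(question: str, answer: str) -> Review:
    # Accept only answers that draw on three findings.
    if "3 findings" in answer:
        return Review(score=0.9, feedback="")
    return Review(score=0.5, feedback="cover the associated risks")


if __name__ == "__main__":
    result = run_peer("How did interest rates move?", toy_plan,
                      toy_execute, toy_express, toy_review)
    print(result)
```

Passing the four agents in as plain callables keeps the loop independent of any particular model, so a proprietary model and a tuned custom model could be swapped in without changing the control flow.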