In domain-specific applications, GPT-4 augmented with precise prompts or Retrieval-Augmented Generation (RAG) shows notable potential but faces a critical trilemma of performance, cost, and data privacy. High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow is often costly and difficult. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework, which systematizes domain-specific tasks by integrating precise question decomposition, advanced information retrieval, comprehensive summarization, and rigorous self-assessment. Given concerns about cost and data privacy, enterprises are shifting from proprietary models such as GPT-4 to custom models that balance cost, security, and performance. We develop industrial practices that leverage online data and user feedback for efficient model tuning. This study provides best-practice guidelines for applying multi-agent systems to domain-specific problem-solving and for implementing effective agent tuning strategies. Our empirical studies, particularly in the financial question-answering domain, demonstrate that our approach achieves 95.0% of GPT-4's performance while effectively managing costs and ensuring data privacy.
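The Plan, Execute, Express, Review cycle named in the abstract can be pictured as a simple control loop. The following is a minimal sketch only, not the authors' implementation: the function `run_peer`, the `Review` dataclass, the `max_rounds` parameter, and all four toy agents are hypothetical names introduced here for illustration, and the assumption that reviewer feedback is passed back to the planner on the next round is ours, since the abstract does not specify how the stages are wired together.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    """Hypothetical reviewer verdict: accept the answer or return feedback."""
    accepted: bool
    feedback: str


def run_peer(
    question: str,
    plan: Callable[[str, str], List[str]],
    execute: Callable[[str], str],
    express: Callable[[str, List[str]], str],
    review: Callable[[str, str], Review],
    max_rounds: int = 3,
) -> str:
    """Run Plan -> Execute -> Express -> Review until accepted or rounds run out."""
    feedback = ""
    answer = ""
    for _ in range(max_rounds):
        sub_questions = plan(question, feedback)        # Plan: decompose the question
        evidence = [execute(q) for q in sub_questions]  # Execute: gather information
        answer = express(question, evidence)            # Express: summarize into an answer
        verdict = review(question, answer)              # Review: self-assess the answer
        if verdict.accepted:
            break
        feedback = verdict.feedback                     # assumed: feedback steers the next plan
    return answer


# Toy stand-ins for the four agents; a real system would call models and retrievers.
def toy_plan(question: str, feedback: str) -> List[str]:
    subs = [f"Background of: {question}", f"Key figures for: {question}"]
    if feedback:
        subs.append(f"Address reviewer feedback: {feedback}")
    return subs


def toy_execute(sub_question: str) -> str:
    return f"[evidence for '{sub_question}']"


def toy_express(question: str, evidence: List[str]) -> str:
    return f"Answer to '{question}' based on {len(evidence)} evidence item(s)."


def toy_review(question: str, answer: str) -> Review:
    ok = "3 evidence" in answer  # toy rule: demand three pieces of evidence
    return Review(accepted=ok, feedback="" if ok else "cover more angles")


result = run_peer("How did Q2 revenue change?", toy_plan, toy_execute, toy_express, toy_review)
print(result)
```

With these toy agents the first round is rejected, the feedback adds a third sub-question, and the second round is accepted. Passing the agents in as plain callables keeps the loop independent of any particular model, which fits the abstract's point about swapping a proprietary model for a tuned custom one.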