Context adaptation automates prompt engineering in LLM-based systems by iteratively revising tunable prompts based on task feedback, without modifying model weights. Extending this paradigm to multi-LLM agentic systems is crucial, but existing methods suffer from inaccurate credit assignment and lack convergence guarantees. We propose \textbf{G}raph-based \textbf{T}arget \textbf{B}ack-\textbf{P}ropagation (GTBP), a context adaptation framework for agentic workflows modeled as directed acyclic graphs. GTBP propagates local target outputs backward through the workflow graph and uses target--output discrepancies to guide a stage-wise prompt update mechanism. Theoretically, we show that GTBP's stage-wise prompt updates stabilize over iterations, and that a sufficiently capable LLM optimizer can decrease the overall objective. Empirically, GTBP consistently outperforms strong baselines across three benchmarks while maintaining comparable computational cost.
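To make the backward flow of targets concrete, the following is a minimal sketch of one forward/backward sweep over a DAG workflow. It is not the GTBP algorithm itself, whose details are not given here: the function names (`target_backprop_step`, `run_node`, `back_target`, `revise`), the single-sink assumption, and the rule of keeping only the first target proposed for a shared parent are all illustrative assumptions. In a real system the three callables would wrap LLM calls and prompts would be text; the toy helpers below replace them with numbers so the control flow can be run deterministically.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def target_backprop_step(parents, prompts, run_node, back_target, revise,
                         source_input, final_target):
    """One forward/backward sweep over a DAG workflow (illustrative sketch).

    `parents` maps every node to the list of its predecessor nodes.
    All callables are hypothetical stand-ins for LLM-backed components.
    """
    # Topological order: every node appears after all of its parents.
    order = list(TopologicalSorter(parents).static_order())

    # Forward pass: run each stage on its parents' outputs (or the task input).
    outputs = {}
    for node in order:
        inputs = [outputs[p] for p in parents[node]] or [source_input]
        outputs[node] = run_node(prompts[node], inputs)

    # Backward pass: the last node in the order (assumed to be the only sink)
    # receives the task-level target; each node then proposes local targets
    # for its parents. If several children share a parent, the first proposal
    # is kept (a simplification made for this sketch).
    targets = {order[-1]: final_target}
    for node in reversed(order):
        if node not in targets:
            continue
        proposed = back_target(parents[node], outputs, outputs[node], targets[node])
        for parent, target in proposed.items():
            targets.setdefault(parent, target)

    # Stage-wise update: revise only prompts whose output misses its target.
    new_prompts = dict(prompts)
    for node, target in targets.items():
        if outputs[node] != target:
            new_prompts[node] = revise(prompts[node], outputs[node], target)
    return new_prompts, outputs, targets


# Toy numeric stand-ins (assumptions, not part of the described method).
def toy_run(prompt, inputs):
    # A "stage" adds its numeric prompt to the sum of its inputs.
    return prompt + sum(inputs)


def toy_back_target(node_parents, outputs, output, target):
    # Hand half of the node's target-output gap to its parents, split evenly.
    if not node_parents:
        return {}
    share = 0.5 * (target - output) / len(node_parents)
    return {p: outputs[p] + share for p in node_parents}


def toy_revise(prompt, output, target):
    # Move the prompt halfway toward closing the local gap.
    return prompt + 0.5 * (target - output)


parents = {"draft": [], "answer": ["draft"]}
prompts = {"draft": 1, "answer": 1}
prompts, outputs, targets = target_backprop_step(
    parents, prompts, toy_run, toy_back_target, toy_revise,
    source_input=0, final_target=6,
)
```

In this two-stage toy chain, each stage is updated using only its own target--output discrepancy, which is the sense in which the update is stage-wise; repeating the step with the returned prompts shrinks the gap at the final node.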