From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

Large language model (LLM)-based agents increasingly solve complex tasks by interacting with external tools, retrieval systems, memory modules, environments, and other agents. These capabilities expand agent autonomy, but also make agent behavior harder to verify, debug, and audit. Final-answer accuracy alone cannot explain how an output was produced, which evidence supported each claim, whether tool calls were justified, how memory influenced later decisions, or where execution failures originated. Evidence tracing and execution provenance address this gap by modeling how retrieved evidence, tool outputs, memory items, environment observations, intermediate claims, actions, and final answers are connected throughout agent execution. This survey provides a systematic review and conceptual framework for evidence tracing and execution provenance in LLM agents. We organize related work around a unified provenance perspective that connects retrieval grounding, claim support, tool-use safety, memory lineage, observability, debugging, audit, and recovery. We introduce a taxonomy covering trace sources, evidence and execution units, provenance relations, tracing granularity and timing, representation forms, and trust functions. We review key methodological directions, including provenance representation, evidence attribution, tool-use provenance, runtime guardrails, provenance-bearing memory, trace-based observability, and failure diagnosis. We also map existing benchmarks, datasets, and evaluation metrics to provenance-related capabilities, and discuss how evaluation can move from final-answer correctness toward process-level accountability. Finally, we outline open challenges, including unified trace schemas, claim-level and semantic provenance, provenance-aware safety mechanisms, realistic execution-trace benchmarks, recovery-oriented evaluation, and privacy-aware audit infrastructure.
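To make the idea of connecting evidence, tool outputs, claims, and final answers more concrete, the following minimal sketch models a provenance graph in Python. It is an illustration under stated assumptions, not the survey's schema: the names `Node`, `ProvenanceGraph`, `sources`, and `unsupported_claims`, the node kinds, and the relation labels (`supported_by`, `derived_from`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One evidence or execution unit (hypothetical schema)."""
    node_id: str
    kind: str      # assumed kinds: "evidence", "tool_output", "memory", "claim", "answer"
    content: str


@dataclass
class ProvenanceGraph:
    nodes: dict = field(default_factory=dict)
    # node_id -> list of (relation, parent_id) links pointing at what the node relies on
    parents: dict = field(default_factory=dict)

    def add(self, node, derived_from=()):
        """Register a node together with its provenance links."""
        self.nodes[node.node_id] = node
        self.parents[node.node_id] = list(derived_from)

    def sources(self, node_id):
        """Return sorted ids of root nodes (nodes with no parents) reachable from node_id."""
        seen, stack, roots = set(), [node_id], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            links = self.parents.get(current, [])
            if not links:
                roots.add(current)
            for _relation, parent_id in links:
                stack.append(parent_id)
        return sorted(roots)

    def unsupported_claims(self):
        """Return sorted ids of claims that have no provenance link at all."""
        return sorted(
            n.node_id
            for n in self.nodes.values()
            if n.kind == "claim" and not self.parents[n.node_id]
        )


# Toy trace: two grounded claims and one ungrounded claim feed the final answer.
g = ProvenanceGraph()
g.add(Node("doc1", "evidence", "retrieved passage"))
g.add(Node("tool1", "tool_output", "calculator result"))
g.add(Node("c1", "claim", "claim backed by the passage"), [("supported_by", "doc1")])
g.add(Node("c2", "claim", "claim backed by the tool"), [("supported_by", "tool1")])
g.add(Node("c3", "claim", "claim with no recorded support"))
g.add(Node("ans", "answer", "final answer"),
      [("derived_from", "c1"), ("derived_from", "c2"), ("derived_from", "c3")])

print(g.sources("ans"))          # roots the answer ultimately rests on
print(g.unsupported_claims())    # claims lacking any evidence link
```

In this toy trace the final answer is still produced, yet the graph shows that one of its claims has no recorded support, which is the kind of process-level signal that final-answer accuracy alone does not reveal.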
