When a multi-agent system produces an incorrect or harmful answer, who is accountable if execution logs and agent identifiers are unavailable? In practice, generated content is often detached from its execution environment due to privacy or system boundaries, leaving the final text as the only auditable artifact. Existing attribution methods rely on full execution traces and thus become ineffective in such metadata-deprived settings. We propose Implicit Execution Tracing (IET), a provenance-by-design framework that shifts attribution from post-hoc inference to built-in instrumentation. Instead of reconstructing hidden trajectories, IET embeds agent-specific, key-conditioned statistical signals directly into the token generation process, transforming the output text into a self-verifying execution record. At inference time, we recover a linearized execution trace from the final text via transition-aware statistical scoring. Experiments across diverse multi-agent coordination settings demonstrate that IET achieves accurate segment-level attribution and reliable transition recovery under identity removal, boundary corruption, and privacy-preserving redaction, while maintaining generation quality. These results show that embedding provenance into generation provides a practical and robust foundation for accountability in multi-agent language systems when execution metadata is unavailable.
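The two mechanisms named in the abstract, key-conditioned signals embedded during token generation and transition-aware scoring at recovery time, can be illustrated with a toy sketch. The abstract does not specify how IET implements either, so everything below is an assumption: a green-list style watermark keyed per agent stands in for the statistical signal, and a Viterbi pass with a fixed switch penalty stands in for transition-aware scoring. The agent names, key strings, vocabulary size, and all parameter values are hypothetical, and the "agents" emit random token ids rather than language model output.

```python
import hashlib
import random

VOCAB_SIZE = 1000      # toy vocabulary of integer token ids (assumption)
GREEN_FRACTION = 0.5   # share of the vocabulary favoured at each step (assumption)


def is_green(key, prev_token, token):
    """Key-conditioned pseudo-random vocabulary split, seeded by the previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION


def generate(key, length, prev_token, rng, bias=0.9):
    """Toy agent: uniform random tokens, where a red token is rejected with probability `bias`."""
    tokens = []
    for _ in range(length):
        while True:
            token = rng.randrange(VOCAB_SIZE)
            if is_green(key, prev_token, token) or rng.random() > bias:
                break
        tokens.append(token)
        prev_token = token
    return tokens


def recover_trace(tokens, keys, switch_penalty=8.0, start_prev=0):
    """Viterbi labelling: +1 per green token, -1 per red token, minus a penalty per agent switch."""
    names = list(keys)
    best = {n: 0.0 for n in names}
    back = []
    prev = start_prev
    for tok in tokens:
        top = max(names, key=lambda n: best[n])
        new_best, ptr = {}, {}
        for n in names:
            emit = 1.0 if is_green(keys[n], prev, tok) else -1.0
            stay, switch = best[n], best[top] - switch_penalty
            if stay >= switch:
                new_best[n], ptr[n] = stay + emit, n
            else:
                new_best[n], ptr[n] = switch + emit, top
        back.append(ptr)
        best, prev = new_best, tok
    # Backtrack from the best final state to get one agent label per token.
    state = max(names, key=lambda n: best[n])
    labels = []
    for ptr in reversed(back):
        labels.append(state)
        state = ptr[state]
    labels.reverse()
    return labels


def collapse(labels):
    """Merge consecutive identical labels into a linearized agent sequence."""
    trace = []
    for name in labels:
        if not trace or trace[-1] != name:
            trace.append(name)
    return trace


# Hypothetical three-agent pipeline; the detector sees only `text` and the keys.
keys = {"planner": "k-planner", "coder": "k-coder", "reviewer": "k-reviewer"}
rng = random.Random(0)
text, truth, prev = [], [], 0
for name in ["planner", "coder", "reviewer"]:
    segment = generate(keys[name], 60, prev, rng)
    text += segment
    truth += [name] * len(segment)
    prev = segment[-1]

labels = recover_trace(text, keys)
trace = collapse(labels)
accuracy = sum(a == b for a, b in zip(labels, truth)) / len(truth)
print(trace, round(accuracy, 3))
```

In this sketch the switch penalty is what makes the scoring transition-aware: without it, each token would be assigned to whichever key happens to mark it green, and the recovered trace would fragment. A larger penalty suppresses spurious hand-offs at the cost of missing very short segments.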