When a multi-agent system produces an incorrect or harmful answer, who is accountable if execution logs and agent identifiers are unavailable? In practice, generated content is often detached from its execution environment due to privacy or system boundaries, leaving the final text as the only auditable artifact. Existing attribution methods rely on full execution traces and thus become ineffective in such metadata-deprived settings. We propose Implicit Execution Tracing (IET), a provenance-by-design framework that shifts attribution from post-hoc inference to built-in instrumentation. Instead of reconstructing hidden trajectories, IET embeds agent-specific, key-conditioned statistical signals directly into the token generation process, transforming the output text into a self-verifying execution record. At inference time, we recover a linearized execution trace from the final text via transition-aware statistical scoring. Experiments across diverse multi-agent coordination settings demonstrate that IET achieves accurate segment-level attribution and reliable transition recovery under identity removal, boundary corruption, and privacy-preserving redaction, while maintaining generation quality. These results show that embedding provenance into generation provides a practical and robust foundation for accountability in multi-agent language systems when execution metadata is unavailable.
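To make the idea concrete, the following is a minimal sketch of key-conditioned signal embedding and transition-aware trace recovery. It is not the IET implementation. It assumes a toy integer vocabulary, a hard "green-list" partition keyed by a per-agent secret (a simplification borrowed from standard text watermarking, where a real system would only softly bias the token logits), and a plain Viterbi decoder with a fixed switch penalty standing in for the transition-aware statistical scoring. All names (`is_green`, `generate`, `recover_trace`, `segments`, the agent names and keys) are hypothetical.

```python
import hashlib
import random
from itertools import groupby

VOCAB_SIZE = 1000   # toy vocabulary of integer token ids (assumption)
GAMMA = 0.5         # fraction of the vocabulary marked "green" per context

def is_green(key: str, prev_token: int, token: int) -> bool:
    """Keyed pseudo-random partition: is `token` green for this key and context?"""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA

def generate(key: str, prev_token: int, length: int, rng: random.Random) -> list[int]:
    """Toy 'agent' that emits only tokens that are green under its own key."""
    out = []
    for _ in range(length):
        token = rng.randrange(VOCAB_SIZE)
        while not is_green(key, prev_token, token):  # rejection sampling
            token = rng.randrange(VOCAB_SIZE)
        out.append(token)
        prev_token = token
    return out

def recover_trace(tokens, keys, start_token=-1, switch_penalty=3.0):
    """Viterbi decoding: most likely agent per token, penalising agent switches."""
    agents = list(keys)
    best = {a: 0.0 for a in agents}   # best path score ending in each agent
    back = []                         # back-pointers, one dict per position
    prev = start_token
    for tok in tokens:
        # Per-agent evidence: +0.5 if the token is green under that key, else -0.5.
        emit = {a: (1.0 if is_green(keys[a], prev, tok) else 0.0) - GAMMA
                for a in agents}
        new_best, pointers = {}, {}
        for a in agents:
            def score_from(b, a=a):
                return best[b] - (0.0 if b == a else switch_penalty)
            pred = max(agents, key=score_from)
            new_best[a] = score_from(pred) + emit[a]
            pointers[a] = pred
        back.append(pointers)
        best, prev = new_best, tok
    # Backtrack from the best final state to obtain one agent label per token.
    state = max(agents, key=best.get)
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def segments(path):
    """Collapse a per-token path into (agent, start, end) segments."""
    out, start = [], 0
    for agent, run in groupby(path):
        n = len(list(run))
        out.append((agent, start, start + n))
        start += n
    return out

# Demo: two agents write consecutive segments; identities and boundaries are dropped.
keys = {"planner": "key-A", "coder": "key-B", "critic": "key-C"}
rng = random.Random(0)
text = generate(keys["planner"], -1, 60, rng)
text += generate(keys["coder"], text[-1], 60, rng)
truth = ["planner"] * 60 + ["coder"] * 60

path = recover_trace(text, keys)
accuracy = sum(p == t for p, t in zip(path, truth)) / len(truth)
print(segments(path), accuracy)
```

The switch penalty is what makes the scoring transition-aware in this sketch: a token that happens to be green under the wrong key cannot flip the attribution on its own, so the decoder returns a few long segments rather than a noisy per-token labelling. The recovered boundary may still be off by a few tokens where both adjacent agents' keys mark the same tokens green.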