Large language models (LLMs) are increasingly embedded in consequential decisions across healthcare, finance, employment, and public services. Yet accountability remains fragile because process transparency is rarely recorded in a durable, reviewable form. We propose LLM audit trails as a sociotechnical mechanism for continuous accountability. An audit trail is a chronological, tamper-evident, context-rich ledger of lifecycle events and decisions that links technical provenance (models, data, training and evaluation runs, deployments, monitoring) with governance records (approvals, waivers, and attestations), so organizations can reconstruct what changed, when, and who authorized it. This paper contributes: (1) a lifecycle framework that specifies event types, required metadata, and governance rationales; (2) a reference architecture with lightweight emitters, append-only audit stores, and an auditor interface supporting cross-organizational traceability; and (3) a reusable, open-source Python implementation that instantiates this audit layer in LLM workflows with minimal integration effort. We conclude by discussing limitations and directions for adoption.
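To make the core mechanism concrete, the sketch below illustrates one way a tamper-evident, append-only audit store might be realized in Python: each lifecycle event is hash-chained to its predecessor, so that editing or deleting any record breaks chain verification. The class and method names (AuditTrail, append, verify) and the event schema are illustrative assumptions for this sketch, not the paper's released API.

```python
# A minimal sketch of a hash-chained, append-only audit ledger.
# All names and the event schema here are illustrative assumptions.
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only, tamper-evident event ledger (SHA-256 hash chain)."""

    GENESIS = "0" * 64  # sentinel hash for the first record

    def __init__(self):
        self._records = []

    def append(self, event_type, actor, metadata):
        """Record a lifecycle event (e.g. a deployment or an approval)."""
        prev_hash = self._records[-1]["hash"] if self._records else self.GENESIS
        body = {
            "event_type": event_type,  # e.g. "model.deploy", "gov.approval"
            "actor": actor,            # who authorized the change
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "metadata": metadata,      # provenance: model id, data hash, run id...
            "prev_hash": prev_hash,    # link to the previous record
        }
        # Canonical JSON (sorted keys) keeps the digest deterministic.
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")
        ).hexdigest()
        self._records.append({**body, "hash": digest})
        return digest

    def verify(self):
        """Recompute the chain; any edited or removed record breaks it."""
        prev_hash = self.GENESIS
        for rec in self._records:
            body = {k: v for k, v in rec.items() if k != "hash"}
            if body["prev_hash"] != prev_hash:
                return False
            recomputed = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode("utf-8")
            ).hexdigest()
            if recomputed != rec["hash"]:
                return False
            prev_hash = rec["hash"]
        return True


trail = AuditTrail()
trail.append("model.deploy", "alice", {"model": "llm-v2", "eval_run": "run-117"})
trail.append("gov.approval", "bob", {"decision": "approved", "scope": "prod"})
assert trail.verify()
```

The design choice worth noting is that tamper evidence comes from the chain itself, not from access control: a reviewer who holds only the final digest can detect retroactive edits anywhere in the log, which is what enables the cross-organizational traceability described above.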