Most prompt-injection detectors score a single event or message. Control-plane attacks against tool-using agents can instead distribute weak directives across a trajectory while keeping each event below threshold. We test whether a proxy-side temporal accumulator recovers this slow-burn signal by reducing frozen per-event scores to peak and CUSUM persistence statistics. To avoid circularity, grafts are generated against a held-out autoregressive cloaking target and then re-scored under a detector of record: a frozen char-ngram SVM plus an embedding-contrastive head. Only floor-met grafts bound to executed action edges and still sub-threshold under the detector of record enter the slow-burn endpoint. This is a boundary result, not a deployable detector. On concentrated attacks, trajectory-level accumulation beats the per-event foil under a clustered bootstrap (gap +0.092, 95% CI [+0.025, +0.155]), while persistence and peak are statistically tied. On git repo-exfil, density-four floor-met sub-threshold grafts add persistence mass that matched benign shams do not (persistence-delta AUC 0.708 over four attack survivors and six benign shams), while the matched peak-delta control does not separate attack from sham (AUC 0.417), localizing the effect to accumulated persistence rather than a single hot graft. The effect fails on broader clean-path actions (persistence-delta AUC 0.167), where the detector assigns attack and benign actions indistinguishable per-event scores, leaving no margin for CUSUM to bank. Independent powering is blocked by only three to four independent tasks. Temporal accumulation is therefore a narrow-band margin amplifier: it can bank elevated sub-threshold signal but cannot create margin where the per-event detector has none. As byproducts, we contribute a pseudo-replication warning and an independence-audit standard for agent-benchmark evaluation.
翻译:暂无翻译