Recent proposals advocate using keystroke timing signals, specifically the coefficient of variation ($\delta$) of inter-keystroke intervals, to distinguish human-composed text from AI-generated content. We demonstrate that this class of defenses is insecure against two practical attack classes: the copy-type attack, in which a human transcribes LLM-generated text and thereby produces authentic motor signals, and timing-forgery attacks, in which automated agents sample inter-keystroke intervals from empirical human distributions. Using 13,000 sessions from the SBU corpus and three timing-forgery variants (histogram sampling, statistical impersonation, and generative LSTM), we show that all attacks achieve $\ge$99.8% evasion rates against five classifiers. While the detectors achieve AUC=1.000 against fully automated injection, they classify $\ge$99.8% of attack samples as human with mean confidence $\ge$0.993. We formalize a non-identifiability result: when the detector observes only timing, the mutual information between timing features and content provenance is zero for copy-type attacks. Although composition and transcription produce statistically distinguishable motor patterns (Cohen's d=1.28), both yield $\delta$ values 2--4$\times$ above detection thresholds, rendering the distinction security-irrelevant. These systems confirm that a human operated the keyboard, not that the human originated the text. Securing provenance requires architectures that bind the writing process to semantic content.
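The histogram-sampling forgery described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the lognormal "human" intervals, function names, and bin count are all assumptions standing in for real corpus data such as the SBU sessions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical empirical human inter-keystroke intervals (IKIs, in ms).
# A real attack would draw these from recorded human typing sessions.
human_ikis = rng.lognormal(mean=5.0, sigma=0.5, size=10_000)

def coefficient_of_variation(ikis):
    """delta = sigma / mu of the inter-keystroke intervals."""
    return np.std(ikis) / np.mean(ikis)

def forge_timings(empirical_ikis, n_keystrokes, bins=50):
    """Histogram-sampling forgery: draw synthetic IKIs from the empirical
    human distribution, so the forged delta matches human statistics."""
    counts, edges = np.histogram(empirical_ikis, bins=bins)
    probs = counts / counts.sum()
    idx = rng.choice(len(probs), size=n_keystrokes, p=probs)
    # Sample uniformly within each chosen histogram bin.
    return rng.uniform(edges[idx], edges[idx + 1])

forged = forge_timings(human_ikis, n_keystrokes=500)
print(coefficient_of_variation(human_ikis), coefficient_of_variation(forged))
```

Because the forged intervals follow the empirical distribution, their $\delta$ is statistically indistinguishable from the human baseline, which is exactly why a timing-only detector accepts them.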