StegoStylo: Squelching Stylometric Scrutiny through Steganographic Stitching

Stylometry, the identification of an author through analysis of a text's style (i.e., authorship attribution), serves many constructive purposes: it supports copyright and plagiarism investigations, aids detection of harmful content, offers exploratory cues for certain medical conditions (e.g., early signs of dementia or depression), provides historical context for literary works, and helps uncover misinformation and disinformation. However, when stylometry is employed for authorship verification (confirming whether a text truly originates from a claimed author), it can also be weaponized for malicious purposes. De-anonymization, re-identification, tracking, and profiling, along with downstream effects such as censorship, illustrate the privacy threats that stylometric analysis can enable. Building on these concerns, this paper explores how adversarial stylometry combined with steganography can counteract stylometric analysis. We first present enhancements to our adversarial attack, $\textit{TraceTarnish}$, providing stronger evidence of its capacity to confound stylometric systems and reduce their attribution and verification accuracy. Next, we examine how steganographic embedding can be tuned to mask an author's stylistic fingerprint, quantifying the level of authorship obfuscation achievable as a function of the proportion of words altered with zero-width Unicode characters. Our findings suggest that steganographic coverage of 33% or higher appears sufficient to achieve authorship obfuscation. Finally, we reflect on the ways stylometry can be used to undermine privacy and argue for the necessity of defensive tools like $\textit{TraceTarnish}$.
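
To make the notion of "steganographic coverage" concrete, the following is a minimal Python sketch of altering a chosen proportion of words with a zero-width Unicode character. It is an illustration under stated assumptions, not the $\textit{TraceTarnish}$ implementation: the abstract does not say which zero-width characters are used, where they are placed, or whether they carry a payload. The function names `embed_zero_width`, `measured_coverage`, and `strip_zero_width`, the choice of U+200B (ZERO WIDTH SPACE), and the random placement are all hypothetical.

```python
import random

# Assumption: U+200B (ZERO WIDTH SPACE) is the marker character. The paper
# may use a different zero-width character or a mix of several.
ZERO_WIDTH_SPACE = "\u200b"


def embed_zero_width(text: str, coverage: float, seed: int = 0) -> str:
    """Insert one zero-width character into roughly `coverage` of the words.

    Words are the space-separated tokens of `text`. No payload is encoded;
    the character only perturbs the token as seen by a stylometric system.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    rng = random.Random(seed)
    words = text.split(" ")
    # Only non-empty tokens count as words (consecutive spaces yield "").
    candidates = [i for i, w in enumerate(words) if w]
    k = round(coverage * len(candidates))
    chosen = set(rng.sample(candidates, k))
    out = []
    for i, w in enumerate(words):
        if i in chosen:
            # Place the character after a randomly chosen position in the word.
            pos = rng.randint(1, len(w))
            w = w[:pos] + ZERO_WIDTH_SPACE + w[pos:]
        out.append(w)
    return " ".join(out)


def measured_coverage(text: str) -> float:
    """Return the fraction of words that contain a zero-width character."""
    words = [w for w in text.split(" ") if w]
    if not words:
        return 0.0
    return sum(ZERO_WIDTH_SPACE in w for w in words) / len(words)


def strip_zero_width(text: str) -> str:
    """Remove the inserted characters, recovering the visible text."""
    return text.replace(ZERO_WIDTH_SPACE, "")


original = "the quick brown fox jumps over the lazy dog"
stego = embed_zero_width(original, coverage=1 / 3)
```

Here `stego` renders identically to `original` in most environments, while one third of its nine words differ at the code-point level. Sweeping `coverage` over a range and re-running an attribution model on each output is one way such a coverage-versus-obfuscation curve could be measured.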
