The permanence of online content, combined with increasingly effective authorship identification techniques, calls for stronger computational methods to protect the identity and privacy of online authors when needed, e.g., in blind reviews of scientific papers, anonymous online reviews, or anonymous interactions in mental health forums. In this paper, we propose an unsupervised, inference-time approach to authorship obfuscation that addresses the unique challenges of the task: the lack of supervision data across diverse authors and domains, and the need for revision substantial enough to go beyond simple paraphrasing and obfuscate authorship, all while preserving the original content and fluency. We introduce JAMDEC, a user-controlled, inference-time algorithm for authorship obfuscation that can, in principle, be applied to any text and authorship. Our approach builds on small language models such as GPT2-XL, which helps avoid disclosing the original content to proprietary LLM APIs, while reducing the performance gap between small and large language models through algorithmic enhancement. The key idea is to boost the creative power of smaller language models through constrained decoding, while also allowing for user-specified controls and flexibility. Experimental results demonstrate that our approach based on GPT2-XL outperforms previous state-of-the-art methods based on comparably small models, while performing competitively against GPT3.5 175B, a proprietary model that is two orders of magnitude larger.