The rapidly increasing quality of AI-generated content makes it difficult to distinguish human-written from AI-generated text, which may lead to undesirable consequences for society. It is therefore increasingly important to study properties of human texts that are invariant across text domains and levels of writer proficiency, can be easily calculated for any language, and robustly separate natural from AI-generated texts regardless of the generation model and sampling method. In this work, we propose such an invariant for human-written texts: the intrinsic dimensionality of the manifold underlying the set of embeddings for a given text sample. We show that the average intrinsic dimensionality of fluent texts in a natural language is around $9$ for several alphabet-based languages and around $7$ for Chinese, while the average intrinsic dimensionality of AI-generated texts for each language is lower by $\approx 1.5$, with a clear statistical separation between the human-generated and AI-generated distributions. This property allows us to build a score-based artificial text detector. The proposed detector's accuracy is stable across text domains, generator models, and human writer proficiency levels, and it outperforms SOTA detectors in model-agnostic and cross-domain scenarios by a significant margin.
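As a rough illustration of how an intrinsic-dimension score could drive such a detector, the sketch below estimates the intrinsic dimensionality of a point cloud and thresholds it. The abstract does not name the estimator, so this sketch swaps in the nearest-neighbor maximum-likelihood estimator of Levina and Bickel as a stand-in. The function names, the neighbor count `k`, and the `threshold` argument are hypothetical, and the token embeddings are assumed to be already available as an array of shape `(n_tokens, embedding_dim)`.

```python
import numpy as np


def mle_intrinsic_dimension(points, k=10):
    """Levina-Bickel style k-NN estimate of intrinsic dimension.

    Stand-in estimator (an assumption): the abstract does not specify one.
    `points` has shape (n_points, ambient_dim), with n_points > k >= 2.
    """
    x = np.asarray(points, dtype=float)
    if x.ndim != 2 or k < 2 or x.shape[0] <= k:
        raise ValueError("need a 2-D array with more than k >= 2 points")

    # Pairwise squared Euclidean distances via the Gram matrix.
    sq_norms = np.sum(x * x, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negative rounding errors
    np.fill_diagonal(d2, np.inf)     # a point is not its own neighbor

    # Distances to the k nearest neighbors, ascending, shape (n_points, k).
    knn = np.sqrt(np.sort(d2, axis=1)[:, :k])
    knn = np.maximum(knn, 1e-12)     # guard against duplicate points

    # Per-point inverse estimate: mean over j < k of log(T_k / T_j).
    inv_dim = np.log(knn[:, -1:] / knn[:, :-1]).mean(axis=1)
    valid = inv_dim > 0
    if not np.any(valid):
        raise ValueError("degenerate neighborhoods; cannot estimate dimension")
    return float(np.mean(1.0 / inv_dim[valid]))


def looks_generated(embeddings, threshold, k=10):
    """Score-based decision: flag a text whose estimated dimension is low.

    `threshold` is hypothetical and must be calibrated on held-out data
    for the chosen embedding model and estimator.
    """
    return bool(mle_intrinsic_dimension(embeddings, k=k) < threshold)
```

The values of about $9$ and $7$ quoted above refer to the authors' own estimator and embeddings, so they should not be reused as thresholds for this stand-in. In practice the threshold would be calibrated on labeled human and generated samples for the specific embedding model.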