As large language models (LLMs) become increasingly prevalent, reliable methods for detecting AI-generated text are critical for mitigating potential risks. We introduce DependencyAI, a simple and interpretable approach for detecting AI-generated text using only the labels of linguistic dependency relations. Our method achieves competitive performance across monolingual, multi-generator, and multilingual settings. To increase interpretability, we analyze feature importance to reveal the syntactic structures that distinguish AI-generated from human-written text. We also observe systematic overprediction for certain generators on unseen domains, suggesting that generator-specific writing styles may affect cross-domain generalization. Overall, our results demonstrate that dependency relations alone provide a robust signal for AI-generated text detection, establishing DependencyAI as a strong, linguistically grounded, interpretable, non-neural baseline.
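The core idea can be illustrated with a minimal sketch. Note the assumptions: the paper does not specify its features or classifier here, so the code below uses bag-of-dependency-relation frequencies with a nearest-centroid rule, and the label sequences (e.g. from a Universal Dependencies parser) are hypothetical toy data.

```python
# Sketch of a dependency-label-only detector (illustrative, not the paper's
# actual pipeline): featurize each text as relative frequencies of its
# dependency relation labels, then classify by nearest class centroid.
from collections import Counter
import math

# Hypothetical dependency-label sequences, as a UD parser might produce.
HUMAN = [
    ["nsubj", "root", "obj", "punct"],
    ["nsubj", "root", "advmod", "punct"],
]
AI = [
    ["amod", "nsubj", "root", "amod", "obj", "punct"],
    ["amod", "nsubj", "root", "obj", "amod", "punct"],
]

def featurize(labels):
    """Relative frequency of each dependency relation label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {rel: c / total for rel, c in counts.items()}

def centroid(docs):
    """Mean feature vector over a list of label sequences."""
    vecs = [featurize(d) for d in docs]
    keys = {k for v in vecs for k in v}
    return {k: sum(v.get(k, 0.0) for v in vecs) / len(vecs) for k in keys}

def distance(a, b):
    """Euclidean distance between two sparse feature dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def classify(labels, human_c, ai_c):
    """Assign the label of the nearer centroid."""
    f = featurize(labels)
    return "ai" if distance(f, ai_c) < distance(f, human_c) else "human"

human_c, ai_c = centroid(HUMAN), centroid(AI)
print(classify(["amod", "nsubj", "root", "amod", "obj", "punct"], human_c, ai_c))
# → ai
```

Because the feature space is just the inventory of dependency relation labels, each dimension is directly inspectable, which is what makes feature-importance analysis over such a detector interpretable.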