Authorship Verification (AV) is the process of analyzing a set of documents to determine whether they were written by a specific author. This problem often arises in forensic scenarios, e.g., in cases where the documents in question constitute evidence for a crime. Existing state-of-the-art AV methods rely on computational solutions that lack a plausible scientific explanation for why they work and that are often difficult for analysts to interpret. To address this, we propose a method based on a quantity we call $\lambda_G$ (LambdaG): the ratio between the likelihood of a document given a model of the Grammar for the candidate author and the likelihood of the same document given a model of the Grammar for a reference population. These Grammar Models are estimated using $n$-gram language models that are trained solely on grammatical features. Although it does not need large amounts of training data, LambdaG outperforms other established AV methods of higher computational complexity, including a fine-tuned Siamese Transformer network. In an empirical evaluation against four baseline methods on twelve datasets, LambdaG achieves better results in terms of both accuracy and AUC in eleven cases, and in all twelve cases when only topic-agnostic methods are considered. The algorithm is also highly robust to substantial variation in the genre of the reference population in many cross-genre comparisons. In addition to these properties, we demonstrate that LambdaG is easier to interpret than current state-of-the-art methods. We argue that the advantage of LambdaG over other methods is due to the fact that it is compatible with Cognitive Linguistic theories of language processing.
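The likelihood ratio described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the $n$-gram order, the smoothing scheme, or how grammatical features are extracted, so the code below assumes bigrams, add-$k$ smoothing, and input that has already been reduced to grammatical tokens (function words and part-of-speech placeholders). The names `AddKNgramModel` and `log_lambda_g` are hypothetical, and the score is computed in log space for numerical stability.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token sequence, padded with sentence markers."""
    padded = ["<s>"] * (n - 1) + list(tokens) + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


class AddKNgramModel:
    """Toy n-gram model with add-k smoothing (an assumption of this sketch)."""

    def __init__(self, sentences, vocab, n=2, k=1.0):
        self.n, self.k = n, k
        # A vocabulary shared by both models keeps the two likelihoods comparable.
        self.vocab_size = len(set(vocab) | {"</s>"})
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        for sentence in sentences:
            for gram in ngrams(sentence, n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def log_prob(self, tokens):
        """Log-likelihood of one token sequence under the model."""
        total = 0.0
        for gram in ngrams(tokens, self.n):
            numerator = self.ngram_counts[gram] + self.k
            denominator = self.context_counts[gram[:-1]] + self.k * self.vocab_size
            total += math.log(numerator / denominator)
        return total


def log_lambda_g(tokens, author_model, reference_model):
    """Log of the ratio: likelihood under the author model over the reference model."""
    return author_model.log_prob(tokens) - reference_model.log_prob(tokens)


# Illustrative, made-up data: content words replaced by POS placeholders.
author_sents = [["the", "NOUN", "of", "the", "NOUN"], ["the", "NOUN", "of", "NOUN"]]
reference_sents = [["NOUN", "and", "NOUN", "VERB"], ["a", "NOUN", "VERB", "the", "NOUN"]]
vocab = {tok for sent in author_sents + reference_sents for tok in sent}

author_model = AddKNgramModel(author_sents, vocab)
reference_model = AddKNgramModel(reference_sents, vocab)

questioned = ["the", "NOUN", "of", "the", "NOUN"]
score = log_lambda_g(questioned, author_model, reference_model)
print(score > 0)  # a positive score favours the candidate author in this toy case
```

A positive log-ratio means the questioned document is more likely under the candidate author's grammar model than under the reference population's model, and a negative one means the opposite. How such a score is calibrated or thresholded into a verification decision is not covered by this sketch.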