Three publicly available LLMs specifically designed for legal tasks have been shown to gain classification accuracy from training over legal corpora, but why and how? Here we use two publicly available legal datasets: a simpler binary classification task over ``overruling'' texts, and a more elaborate multiple-choice task identifying the ``holding'' of judicial decisions. We report on experiments contrasting these legal LLMs with a generic BERT model on both datasets. We use integrated gradient attribution techniques to impute ``causes'' of variation in the models' performance, and characterize them in terms of the tokenizations each model uses. We find that while all models can correctly classify some test examples from the CaseHOLD task, other examples can be identified by only one model, and attribution can be used to highlight the reasons for this. We find that differential behavior of the models' tokenizers accounts for most of the difference, and we analyze these differences in terms of the legal language they process. Frequency analysis of tokens generated from the dataset texts, combined with known ``stop word'' lists, allows identification of tokens that are clear signifiers of legal topics.