We evaluated the capability of a state-of-the-art generative pre-trained transformer (GPT) model to perform semantic annotation of short text snippets (one to a few sentences) drawn from legal documents of various types. Discussions of potential uses of this emerging technology in the legal domain (e.g., document drafting, summarization) have intensified, but to date there has been no rigorous analysis of the capacity of these large language models (LLMs) to perform sentence-level semantic annotation of legal texts in zero-shot learning settings. Yet this particular use could unlock many practical applications (e.g., in contract review) and research opportunities (e.g., in empirical legal studies). We fill this gap with the present study. We examined whether, and how successfully, the model can semantically annotate small batches of short text snippets (10-50) based exclusively on concise definitions of the semantic types. We found that the GPT model performs surprisingly well in zero-shot settings on diverse types of documents (F1 = .73 on a task involving court opinions, .86 for contracts, and .54 for statutes and regulations). These findings can be leveraged by legal scholars and practicing lawyers alike to guide their decisions on integrating LLMs into a wide range of workflows involving semantic annotation of legal texts.