Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models

Court transcripts and judgments are rich repositories of legal knowledge, detailing the intricacies of cases and the rationale behind judicial decisions. Extracting key information from these documents provides a concise overview of a case, which is valuable to both legal experts and the public. With the advent of large language models (LLMs), automatic information extraction has become increasingly feasible and efficient. This paper presents a study of GPT-4, a large language model, applied to automatic information extraction from UK Employment Tribunal (UKET) cases. We evaluated GPT-4's performance in extracting critical information through a manual verification process that checked the accuracy and relevance of the extracted data. Our research is structured around two primary extraction tasks. The first is a general extraction of eight key aspects that are significant for both legal specialists and the general public: the facts of the case, the claims made, references to legal statutes, references to precedents, general case outcomes and corresponding labels, detailed order and remedies, and reasons for the decision. The second task is more focused: it analyses three of those extracted features, namely facts, claims and outcomes, in order to facilitate the development of a tool capable of predicting the outcome of employment law disputes. Through our analysis, we demonstrate that LLMs like GPT-4 can obtain high accuracy in legal information extraction, highlighting the potential of LLMs to change how legal information is processed and used, with significant implications for legal research and practice.
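As an illustration only, the two tasks can be pictured as one record with eight fields and a narrower three-field view for outcome prediction. The sketch below is a hypothetical schema, not the format used in the study: the class name, field names, field types, and the choice of `general_outcome` as the "outcome" feature are all assumptions made for this example.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class TribunalExtraction:
    """Hypothetical container for the eight aspects listed in the abstract."""
    facts: str                       # facts of the case
    claims: List[str]                # claims made
    statute_references: List[str]    # references to legal statutes
    precedent_references: List[str]  # references to precedents
    general_outcome: str             # general case outcome
    outcome_label: str               # label corresponding to the outcome
    order_and_remedies: str          # detailed order and remedies
    reasons: str                     # reasons for the decision


# Assumption: the second task uses these three fields (facts, claims, outcomes).
PREDICTION_FIELDS = ("facts", "claims", "general_outcome")


def prediction_view(record: TribunalExtraction) -> Dict[str, object]:
    """Return only the three fields intended for outcome prediction."""
    data = asdict(record)
    return {name: data[name] for name in PREDICTION_FIELDS}
```

A record filled in by an extraction step, however that step is implemented, could then be passed to `prediction_view` to obtain the inputs and target for a downstream prediction tool.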
