This study investigates judgment prediction in a realistic scenario within the context of Indian judgments, utilizing a range of transformer-based models, including InLegalBERT, BERT, and XLNet, alongside LLMs such as Llama-2 and GPT-3.5 Turbo. In this realistic scenario, we simulate how judgments are predicted at the point when a case is presented for a decision in court, using only the information available at that time, such as the facts of the case, statutes, precedents, and arguments. This approach mimics real-world conditions, where decisions must be made without the benefit of hindsight, unlike the retrospective analyses often found in previous studies. For the transformer models, we experiment with hierarchical transformers and with summarization of the judgment facts to optimize the input for these models. Our experiments with LLMs reveal that GPT-3.5 Turbo excels in this realistic scenario, demonstrating robust performance in judgment prediction. Furthermore, incorporating additional legal information, such as statutes and precedents, significantly improves prediction performance. The LLMs also provide explanations for their predictions. To evaluate the quality of these predictions and explanations, we introduce two human evaluation metrics: Clarity and Linking. Our findings from both automatic and human evaluations indicate that, despite advancements in LLMs, they have yet to achieve expert-level performance in judgment prediction and explanation tasks.