Legal charge prediction, an essential task in legal AI, seeks to assign accurate charge labels to case descriptions and has attracted significant interest in recent years. Existing methods primarily employ diverse neural network architectures to model case descriptions directly, failing to effectively leverage multi-source external knowledge. We propose a prompt-learning-based method that simultaneously exploits multi-source heterogeneous external knowledge from a legal knowledge base, a conversational LLM, and related legal articles. Specifically, we match knowledge snippets in case descriptions against the legal knowledge base and encapsulate them into the input through a hard prompt template. Additionally, we retrieve legal articles related to a given case description through contrastive learning, and then extract factual elements from the case description with a conversational LLM. We fuse the embedding vectors of soft prompt tokens with the encoding vector of the factual elements to achieve knowledge-enhanced forward inference. Experimental results show that our method achieves state-of-the-art results on CAIL-2018, the largest legal charge prediction dataset, while exhibiting lower data dependency. Case studies further demonstrate our method's strong interpretability.
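The two knowledge-injection steps described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the template wording, the keyword-matching rule, and the weighted-sum fusion operator are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def match_snippets(case_text, kb):
    """Return knowledge-base snippets whose trigger term appears in the
    case description (hypothetical matching rule for illustration)."""
    return [desc for term, desc in kb.items() if term in case_text]

def build_hard_prompt(case_text, kb):
    """Encapsulate matched snippets into the input via a hard prompt
    template (the bracketed markers are illustrative, not the paper's)."""
    knowledge = " ".join(match_snippets(case_text, kb))
    return f"[Knowledge] {knowledge} [Case] {case_text} [Charge] [MASK]"

def fuse_embeddings(soft_prompt_emb, fact_emb, alpha=0.5):
    """Fuse soft-prompt token embeddings with the factual-element
    encoding. A convex combination is assumed here; the abstract does
    not name the fusion operator."""
    return alpha * soft_prompt_emb + (1.0 - alpha) * fact_emb

# Toy knowledge base: trigger term -> legal knowledge snippet.
kb = {
    "stole": "Theft: unlawfully taking another's property.",
    "assault": "Assault: intentionally inflicting bodily harm.",
}

prompt = build_hard_prompt("The defendant stole a car at night.", kb)
fused = fuse_embeddings(np.ones((4, 8)), np.zeros((4, 8)))
```

In this sketch, `prompt` contains only the theft snippet (the assault term never matches), and `fused` keeps the shape of the soft-prompt embedding matrix, so it can be fed to the encoder in place of the original soft prompt tokens.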