Confusing charge prediction is a challenging task in legal AI: given a fact description, the model must choose among charges that are easily confused with one another. Existing charge prediction methods perform well overall but struggle on confusing charges such as Snatch and Robbery. In the legal domain, constituent elements play a pivotal role in distinguishing such charges. Constituent elements are the fundamental behaviors underlying criminal punishment, and they differ only subtly from one charge to another. In this paper, we propose a novel From Graph to Word Bag (FWGB) approach, which introduces domain knowledge about constituent elements to guide the model's judgments on confusing charges, much like a judge's reasoning process. Specifically, we first construct a legal knowledge graph containing constituent elements and use it to select keywords for each charge, forming a word bag. Then, to direct the model's attention to the information in the context that differentiates each charge, we extend the attention mechanism and introduce a new loss function that supervises attention using the words in the word bag. We construct a confusing-charges dataset from real-world judicial documents. Experiments demonstrate the effectiveness of our method, which in particular maintains exceptional performance under imbalanced label distributions.
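To make the idea of attention supervision through a word bag concrete, the following is a minimal sketch under stated assumptions, not the loss defined in the paper. The abstract does not give the exact formulation, so the sketch assumes one simple form: the supervision term is the negative log of the attention mass placed on tokens that appear in the gold charge's word bag, added to a standard cross-entropy term with a hypothetical weight `lam`. The function names, the example keywords, and the attention values are all illustrative.

```python
import math


def attention_supervision_loss(tokens, attention, word_bag, eps=1e-12):
    """Assumed form: -log of the attention mass on word-bag tokens.

    tokens    : list of str, the tokenized fact description
    attention : list of float, one weight per token (expected to sum to 1)
    word_bag  : set of str, keywords selected for the gold charge
    """
    if len(tokens) != len(attention):
        raise ValueError("tokens and attention must have the same length")
    mass = sum(a for t, a in zip(tokens, attention) if t in word_bag)
    # eps keeps the log finite when no attention falls on word-bag tokens
    return -math.log(mass + eps)


def total_loss(class_probs, gold_label, tokens, attention, word_bag, lam=0.5):
    """Cross-entropy on the charge plus the weighted supervision term.

    lam is a hypothetical trade-off weight, not a value from the paper.
    """
    cross_entropy = -math.log(class_probs[gold_label])
    supervision = attention_supervision_loss(tokens, attention, word_bag)
    return cross_entropy + lam * supervision


# Toy example with made-up keywords and attention weights.
tokens = ["he", "grabbed", "the", "bag", "using", "violence"]
robbery_bag = {"violence", "threatened"}  # hypothetical word bag for Robbery

focused = [0.05, 0.10, 0.05, 0.10, 0.10, 0.60]  # attends to "violence"
diffuse = [0.30, 0.30, 0.10, 0.20, 0.05, 0.05]  # mostly ignores it

loss_focused = attention_supervision_loss(tokens, focused, robbery_bag)
loss_diffuse = attention_supervision_loss(tokens, diffuse, robbery_bag)
combined = total_loss({"Robbery": 0.7, "Snatch": 0.3}, "Robbery",
                      tokens, focused, robbery_bag)
```

In this toy setting the supervision term is smaller when attention concentrates on the word-bag token than when it is spread over undifferentiating words, which is the behavior such a loss is meant to encourage. A real implementation would operate on batched tensors inside the model's training loop.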