Confusing charge prediction is a challenging task in legal AI that involves predicting easily confused charges from fact descriptions. Existing charge prediction methods perform well overall, but they struggle to separate confusing charges such as Snatch and Robbery. In the legal domain, constituent elements play a pivotal role in distinguishing confusing charges: they are the fundamental behaviors underlying criminal punishment, and they differ only subtly between charges. In this paper, we propose a novel From Graph to Word Bag (FWGB) approach, which uses domain knowledge about constituent elements to guide the model's judgments on confusing charges, much like a judge's reasoning process. Specifically, we first construct a legal knowledge graph containing constituent elements and use it to select keywords for each charge, forming a word bag. Then, to direct the model's attention to the information in the context that differentiates each charge, we extend the attention mechanism and introduce a new loss function in which words from the word bag supervise the attention. We construct a confusing charges dataset from real-world judicial documents. Experiments demonstrate the effectiveness of our method, which in particular maintains exceptional performance under imbalanced label distributions.
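The attention supervision idea can be illustrated with a small sketch. The abstract does not give the form of the loss, so everything below is an assumption made for illustration: the function names, the example word bag, and the choice of penalizing the negative log of the attention mass that falls on word-bag tokens are hypothetical, not the paper's formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_supervision_loss(tokens, scores, word_bag, eps=1e-12):
    """Hypothetical supervision term: -log of the attention mass on word-bag tokens.

    Assumed form, not taken from the paper. Returns 0.0 when no token
    belongs to the word bag, since there is nothing to supervise.
    """
    if not any(tok in word_bag for tok in tokens):
        return 0.0
    attention = softmax(scores)
    mass = sum(a for tok, a in zip(tokens, attention) if tok in word_bag)
    return -math.log(mass + eps)


# Toy fact description and a hypothetical word bag for the Robbery charge.
tokens = ["defendant", "grabbed", "bag", "using", "violence"]
robbery_bag = {"violence", "threat"}

# Attention concentrated on "violence" versus attention spread uniformly.
focused = attention_supervision_loss(tokens, [0.0, 0.0, 0.0, 0.0, 3.0], robbery_bag)
diffuse = attention_supervision_loss(tokens, [0.0] * 5, robbery_bag)
unmatched = attention_supervision_loss(["he", "left"], [0.0, 0.0], robbery_bag)
```

In a sketch like this, the term would typically be added to the classification loss with a weighting coefficient, so that attention concentrated on charge-specific keywords lowers the total loss; the weighting scheme is likewise an assumption.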