In case law, precedents are the relevant cases used to support judges' decisions and lawyers' opinions on a given case. This relevance is referred to as the case-to-case reference relation. To find relevant cases efficiently in a large case pool, legal practitioners widely use retrieval tools. Existing legal case retrieval models mainly work by comparing the text representations of individual cases. Although they achieve decent retrieval accuracy, the intrinsic connectivity relationships among cases have not been well exploited for case encoding, which limits further improvement of retrieval performance. A case pool contains three types of case connectivity relationships: the case reference relationship, the case semantic relationship, and the case legal charge relationship. Because legal case retrieval is an inductive task, the case reference relationship cannot be used as input at test time. Thus, in this paper, a CaseLink model based on inductive graph learning is proposed to utilise the intrinsic case connectivity for legal case retrieval. A novel Global Case Graph is incorporated to represent both the case semantic relationship and the case legal charge relationship. A novel contrastive objective with a regularisation on the degree of case nodes is proposed to leverage the information carried by the case reference relationship to optimise the model. Extensive experiments conducted on two benchmark datasets demonstrate the state-of-the-art performance of CaseLink. The code has been released at https://github.com/yanran-tang/CaseLink.
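To make the training objective described above more concrete, the following is a minimal sketch of a contrastive loss combined with a degree regulariser. It is not taken from the released CaseLink code: the function names, the cosine-similarity scoring, the temperature, the regularisation weight, and the "soft degree" defined as clipped pairwise similarity are all assumptions chosen for illustration. The one property it does reflect from the abstract is that the case reference relationship enters only as a training label (`positive_index`), never as model input.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def contrastive_loss(query, candidates, positive_index, temperature=0.1):
    """InfoNCE-style loss: the referenced candidate should score highest.

    positive_index marks the candidate that the query case actually cites;
    this label is only needed during training (assumed setup).
    """
    logits = [cosine(query, c) / temperature for c in candidates]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[positive_index]


def degree_regulariser(candidates):
    """Mean soft degree of the candidate nodes (hypothetical definition).

    Each ordered pair of distinct nodes contributes its cosine similarity,
    clipped at zero, so densely connected candidate sets are penalised.
    """
    n = len(candidates)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += max(0.0, cosine(candidates[i], candidates[j]))
    return total / (n * (n - 1))


def total_loss(query, candidates, positive_index, reg_weight=0.01, temperature=0.1):
    """Contrastive term plus weighted degree regularisation (assumed combination)."""
    return (contrastive_loss(query, candidates, positive_index, temperature)
            + reg_weight * degree_regulariser(candidates))
```

In a real model the embeddings would come from a graph neural network over the Global Case Graph and the loss would be computed with an automatic differentiation library; plain lists are used here only to keep the sketch self-contained.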