In recent decades, legal case retrieval has received increasing attention. Legal practitioners rely on similar-case retrieval to improve their working efficiency. During a search, practitioners often need results that span several different causes of action as references. However, existing work tends to focus on the relevance of the judgments themselves, without considering the connections between causes of action. Several well-established search result diversification techniques already exist in open-domain search. However, these techniques do not account for the specifics of legal search scenarios; for example, subtopics may not be independent of one another but related in some way. We therefore construct a diversity-aware legal retrieval model that accounts for both diversity and relevance and is well suited to this scenario. In addition, given the lack of datasets with diversity labels, we construct a legal retrieval dataset for diversity and obtain its labels through manual annotation. Experiments confirm that our model is effective.