This report addresses the scarcity of labeled datasets for developing legal recommender systems, particularly in specialized domains such as labor disputes. We propose an approach that uses the co-citation of legal articles within cases to establish similarity between cases and to enable algorithmic annotation. The method parallels the concept of case co-citation, in which shared cited precedents are taken as indicators of shared legal issues. To evaluate the resulting labels, we employ a system that recommends similar cases based on plaintiffs' accusations, defendants' rebuttals, and points of dispute. The evaluation shows that the recommender, with fine-tuned text embedding models and a reasonable BiLSTM module, can recommend labor cases whose similarity is measured by the co-citation of legal articles. This research contributes to the development of automated annotation techniques for legal documents, particularly in areas with limited access to comprehensive legal databases.
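As an illustration of how co-citation of legal articles could yield algorithmic labels, the following is a minimal sketch and not the method evaluated in this report. It assumes each case is reduced to the set of legal articles it cites, scores a pair of cases by the Jaccard overlap of those sets, and marks the pair as similar when the score reaches a threshold. The Jaccard measure, the threshold of 0.5, the function names, and the article identifiers are all assumptions made for illustration; the abstract does not specify them.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two collections of cited article identifiers."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two cases that cite nothing share no evidence of similarity.
        return 0.0
    return len(a & b) / len(a | b)


def label_pairs(citations, threshold=0.5):
    """Score every unordered pair of cases by article co-citation.

    citations: dict mapping a case id to the set of articles it cites.
    threshold: assumed cut-off above which a pair counts as similar.
    Returns a dict {(case_i, case_j): (score, is_similar)}.
    """
    labels = {}
    for i, j in combinations(sorted(citations), 2):
        score = jaccard(citations[i], citations[j])
        labels[(i, j)] = (score, score >= threshold)
    return labels


# Hypothetical toy data: case ids and article identifiers are placeholders.
cases = {
    "case_A": {"Art. 14", "Art. 17", "Art. 59"},
    "case_B": {"Art. 14", "Art. 17"},
    "case_C": {"Art. 84"},
}
labels = label_pairs(cases)
for pair, (score, is_similar) in labels.items():
    print(pair, round(score, 3), is_similar)
```

Pairs labeled this way could serve as training or evaluation targets for a recommender that only sees the textual parts of a case, such as the accusations, rebuttals, and points of dispute. A graded score rather than a binary label may be preferable when cases cite many articles.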