The growing number of pending legal cases in populous countries such as India has become a major issue. Effective techniques for processing and understanding legal documents can help address this problem. In this paper, we present our systems for SemEval-2023 Task 6: understanding legal texts (Modi et al., 2023). Specifically, we first develop the Legal-BERT-HSLN model, which considers comprehensive context information at both the intra- and inter-sentence levels, to predict rhetorical roles (subtask A). We then train a Legal-LUKE model, which is legal-contextualized and entity-aware, to recognize legal entities (subtask B). Our evaluations demonstrate that the models we designed are more accurate than the baselines, e.g., with an F1 score up to 15.0% higher in subtask B. We achieved notable performance on the task leaderboard, e.g., a micro F1 score of 0.834, ranking 5th out of 27 teams in subtask A.