This paper summarizes Team SCaLAR's work on SemEval-2024 Task 5: Legal Argument Reasoning in Civil Procedure. To address this binary classification task, which is challenging because of the complexity of the legal texts involved, we propose a simple yet novel similarity- and distance-based unsupervised approach to generate labels. We further explore multi-level fusion of Legal-BERT embeddings using ensemble features, including those from CNN, GRU, and LSTM. To handle the lengthy legal explanations in the dataset, we introduce T5-based segment-wise summarization, which retains crucial information and improves the model's performance. Our unsupervised system achieved a 20-point increase in macro F1-score on the development set and a 10-point increase on the test set, which is promising given its uncomplicated architecture.
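The similarity-based label generation mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the text representation, the distance measure, or the decision rule, so the sketch assumes plain token-count vectors (standing in for Legal-BERT embeddings), cosine similarity, and a fixed threshold. The names `bag_of_words`, `cosine_similarity`, `generate_label`, and the `threshold` value are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    # Assumption: token counts stand in for the learned embeddings
    # that a real system would use.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    # Dot product over shared tokens, divided by the product of the norms.
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty text is treated as dissimilar to everything.
    return dot / (norm_a * norm_b)


def generate_label(explanation: str, candidate_answer: str,
                   threshold: float = 0.3) -> int:
    # Hypothetical decision rule: label the candidate answer as correct (1)
    # when it is sufficiently similar to the explanation, otherwise 0.
    # The threshold of 0.3 is an illustrative assumption, not a tuned value.
    similarity = cosine_similarity(bag_of_words(explanation),
                                   bag_of_words(candidate_answer))
    return 1 if similarity >= threshold else 0
```

No labeled training data is needed for a rule of this kind, which is what makes it unsupervised; in practice the threshold would have to be chosen on a development set.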