This paper introduces a Korean legal judgment prediction (LJP) dataset for insurance disputes. Successful LJP models for insurance disputes can benefit both insurance companies and their customers: by predicting the likely outcome of the dispute mediation process before it begins, such models can save both sides time and money. As is often the case with low-resource languages, the amount of data available for this specific task is limited. To mitigate this issue, we investigate how to achieve good performance despite the limited data. In our experiment, we demonstrate that Sentence Transformer Fine-tuning (SetFit; Tunstall et al., 2022) is a good alternative to standard fine-tuning when training data are limited. Models fine-tuned with the SetFit approach on our data perform similarly to the Korean LJP benchmark models (Hwang et al., 2022) despite the much smaller data size.