This study addresses the growing concern of information asymmetry in consumer contracts, exacerbated by the proliferation of online services whose complex Terms of Service are rarely read. Although automatic analysis methods have been studied, the problem is aggravated by that research's prevailing focus on English-language machine learning approaches and on major jurisdictions such as the European Union. We introduce a new methodology and a substantial dataset to address this gap. We propose a novel annotation scheme with four categories and a total of 20 classes, and apply it to 50 online Terms of Service used in Chile. Our evaluation of transformer-based models highlights how factors such as language- and/or domain-specific pre-training, few-shot sample size, and model architecture affect the detection and classification of potentially abusive clauses. Results show large variability in performance across tasks and models: for the detection task, the highest macro-F1 scores range from 79% to 89% and micro-F1 scores reach up to 96%, while for the classification task macro-F1 scores range from 60% to 70% and micro-F1 scores from 64% to 80%. Notably, this is the first Spanish-language multi-label classification dataset for legal clauses, applying Chilean law and offering a comprehensive evaluation of Spanish-language models in the legal domain. Our work lays the groundwork for future research on method development in rarely considered areas of legal analysis and could lead to practical applications that support consumers in Chile and Latin America as a whole.