Terms of Service (ToS) form an integral part of any agreement, as they define the legal relationship between a service provider and an end user. They establish and delineate reciprocal rights and responsibilities, and they inform users about essential aspects of the contracts that govern the use of digital spaces, such as limitation of liability and data protection. Users tend to accept the ToS without reading them before using an application or service. This leaves them in a potentially weaker position if they later need to take action. Existing methods for detecting or classifying unfair clauses are dated and show only modest performance. In this paper, we present state-of-the-art (SOTA) results on unfair clause detection in ToS documents, using a novel approach that combines fine-tuned BERT with a Support Vector Classifier (SVC). The approach achieves a macro F1-score of 0.922 on unfair clause detection and also performs strongly when classifying unfair clauses by tag. In addition, we carry out a comparative analysis by answering research questions about the Transformer models used. To support further research and experimentation, the code and results are available at https://github.com/batking24/Unfair-TOS-An-Automated-Approach-based-on-Fine-tuning-BERT-in-conjunction-with-ML.
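For readers unfamiliar with the headline metric, the sketch below shows how a macro F1-score is computed: the F1-score is calculated separately for each class and the per-class values are averaged with equal weight, so a rare class such as unfair clauses counts as much as the common one. The function name `macro_f1`, the labels "fair" and "unfair", and the toy predictions are assumptions made for illustration only; this is not the authors' evaluation code or data.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1-scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = set(y_true) | set(y_pred)
    per_class_f1 = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
    return sum(per_class_f1) / len(per_class_f1)


# Hypothetical toy data: 8 fair clauses and 2 unfair clauses.
y_true = ["fair"] * 8 + ["unfair"] * 2
y_pred = ["fair"] * 7 + ["unfair"] + ["unfair", "fair"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.4f}")                    # 0.8000
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")    # 0.6875
```

In this toy case the accuracy is 0.8, but the macro F1 is lower because the minority class reaches an F1 of only 0.5. This is why macro averaging is a common choice for imbalanced tasks such as unfair clause detection.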