Terms of Service (ToS) are an integral part of any agreement because they define the legal relationship between a service provider and an end user. They establish and delineate reciprocal rights and responsibilities, and they inform users about essential aspects of the contracts that govern the use of digital spaces, such as limitation of liability and data protection. Users tend to accept the ToS without reading them before using an application or service. This lack of awareness can leave them in a weaker position if legal action is later required. Existing methods for detecting or classifying unfair clauses, however, are dated and show only modest performance. In this paper, we present state-of-the-art (SOTA) results on unfair clause detection in ToS documents, based on a novel custom BERT fine-tuning approach used in conjunction with a Support Vector Classifier (SVC). The approach achieves a macro F1-score of 0.922 on unfair clause detection and also performs strongly on the classification of unfair clauses by tag. We further provide a comparative analysis organized around research questions on the Transformer models used. To support further research and experimentation, the code and results are available at https://github.com/batking24/Unfair-TOS-An-Automated-Approach-based-on-Fine-tuning-BERT-in-conjunction-with-ML.
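For readers unfamiliar with the reported metric, the macro F1-score is the unweighted mean of the per-class F1 scores, so the rarer "unfair" class counts as much as the "fair" class. The following is a minimal sketch of that computation in plain Python. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its repository.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (not data from the paper): 1 = unfair, 0 = fair
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(macro_f1(y_true, y_pred))  # 0.625 (F1 is 0.75 for class 0 and 0.5 for class 1)
```

Because each class contributes equally, a classifier that labels every clause as fair scores poorly on this metric even when its accuracy is high on an imbalanced corpus.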