Enterprises frequently enter into commercial contracts, which can serve as vital sources of project-specific requirements. Contractual clauses are binding, and the requirements derived from them can specify the downstream implementation activities that non-legal stakeholders, including requirements analysts, engineers, and delivery personnel, must carry out. However, comprehending contracts is cognitively demanding and error-prone for such stakeholders because of the extensive use of legalese and the inherent complexity of contract language. Furthermore, contracts often contain ambiguously worded clauses to ensure comprehensive coverage, whereas non-legal stakeholders need a detailed and unambiguous understanding of contractual clauses to craft actionable requirements. In this work, we introduce a novel legal NLP task: generating clarification questions for contracts. These questions aim to identify contract ambiguities at the document level, thereby helping non-legal stakeholders obtain the details needed to elicit requirements. The task poses three core challenges: (1) limited data availability, (2) the length and unstructured nature of contracts, and (3) the complexity of legal text. To address these challenges, we propose ConRAP, a retrieval-augmented prompting framework for generating clarification questions to disambiguate contractual text. Experiments on contracts sourced from the publicly available CUAD dataset show that ConRAP with ChatGPT can detect ambiguities with an F2 score of 0.87, and human evaluators deem 70% of the generated clarification questions useful.
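For readers unfamiliar with the F2 score reported above, the following is a minimal sketch of how an F-beta score is computed from precision and recall. With beta = 2, recall is weighted more heavily than precision, which suits ambiguity detection where missing an ambiguous clause is costlier than flagging an extra one. The function name `f_beta` and the precision and recall values are illustrative assumptions; the abstract does not report the underlying precision and recall.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    beta > 1 favors recall; beta = 1 gives the usual F1 score.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    denominator = beta**2 * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero; define the score as 0.
        return 0.0
    return (1 + beta**2) * precision * recall / denominator


# Hypothetical values, chosen only to illustrate the weighting;
# they are NOT the figures behind the 0.87 reported in the abstract.
example_precision = 0.60
example_recall = 0.90

f1 = f_beta(example_precision, example_recall, beta=1.0)
f2 = f_beta(example_precision, example_recall, beta=2.0)
print(f"F1 = {f1:.3f}, F2 = {f2:.3f}")
```

Because recall exceeds precision in this illustrative case, F2 comes out higher than F1, showing how the metric rewards a detector that catches most ambiguities even at the cost of some false alarms.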