A patent application must be deemed novel and non-obvious to be granted by the US Patent and Trademark Office (USPTO). If it is not, a US patent examiner will cite the prior work, or prior art, that invalidates the novelty and issue a non-final rejection. Predicting which claims of the invention should change in light of the prior art is an essential step in securing invention rights, yet it has not been studied before as a learnable task. In this work we introduce the PatentEdits dataset, which contains 105K examples of successful revisions that overcome objections to novelty. We design algorithms to label edits sentence by sentence, then establish how well these edits can be predicted with large language models (LLMs). We demonstrate that evaluating textual entailment between cited references and draft sentences is especially effective in predicting which inventive claims remain unchanged or are novel in relation to prior art.
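To make the idea of labeling edits sentence by sentence concrete, the following is a minimal sketch and not the labeling algorithm used to build PatentEdits. It assumes each draft claim sentence can be labeled by its best word-level similarity to any sentence in the granted claims. The function name `label_sentences`, the label names, and both thresholds are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def label_sentences(draft, granted, keep_threshold=0.95, edit_threshold=0.5):
    """Label each draft sentence as 'keep', 'edit', or 'delete'.

    Hypothetical heuristic: compare word sequences and use the best
    similarity against any granted sentence. Thresholds are illustrative.
    """
    granted_tokens = [g.lower().split() for g in granted]
    labels = []
    for sentence in draft:
        tokens = sentence.lower().split()
        # Best word-level similarity to any sentence in the granted claims.
        best = max(
            (SequenceMatcher(None, tokens, g).ratio() for g in granted_tokens),
            default=0.0,
        )
        if best >= keep_threshold:
            labels.append("keep")      # survives essentially unchanged
        elif best >= edit_threshold:
            labels.append("edit")      # survives in a revised form
        else:
            labels.append("delete")    # no close counterpart after revision
    return labels


draft = [
    "A widget comprising a housing.",
    "The widget of claim 1, wherein the housing is metal.",
    "A method of painting fences.",
]
granted = [
    "A widget comprising a housing.",
    "The widget of claim 1, wherein the housing is aluminum alloy.",
]
print(label_sentences(draft, granted))  # ['keep', 'edit', 'delete']
```

Labels of this kind could serve as prediction targets for a model that reads the draft claims together with the cited prior art. A surface-similarity heuristic like this one does not capture the entailment relationship the abstract describes, which would need a separate model.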