Because error correction datasets are prohibitively expensive to create, most Factual Claim Correction methods rely on a powerful verification model to guide the correction process. This leads to a significant drop in performance in domains such as scientific claims, where good verification models do not always exist. In this work, we introduce SciFix, a scientific claim correction system that does not require a verifier yet outperforms existing methods by a considerable margin, achieving correction accuracy of 84% on the SciFact dataset, 77% on SciFact-Open, and 72% on the CovidFact dataset, compared to next-best accuracies of 7%, 5%, and 15% on the same datasets, respectively. Our method uses LLM prompting during training to create a richly annotated dataset that can be used for fully supervised training and regularization. We additionally use a claim-aware decoding procedure to improve the quality of corrected claims. Our method outperforms the very LLM that was used to generate the annotated dataset: few-shot prompting on GPT3.5 achieves 58%, 61%, and 64% on the respective datasets, a consistently lower correction accuracy, despite using nearly 800 times as many parameters as our model.