SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models - 专知论文

会员服务 ·

0

语言模型化 · 黑盒 · MoDELS · SimPLe · GPT-3 ·

2023 年 3 月 15 日

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

翻译：SelfCheckGPT：生成式大语言模型的零资源黑箱幻觉检测

Potsawee Manakul,Adian Liusie,Mark J. F. Gales

Generative Large Language Models (LLMs) such as GPT-3 are capable of generating highly fluent responses to a wide variety of user prompts. However, LLMs are known to hallucinate facts and make non-factual statements which can undermine trust in their output. Existing fact-checking approaches either require access to token-level output probability distribution (which may not be available for systems such as ChatGPT) or external databases that are interfaced via separate, often complex, modules. In this work, we propose "SelfCheckGPT", a simple sampling-based approach that can be used to fact-check black-box models in a zero-resource fashion, i.e. without an external database. SelfCheckGPT leverages the simple idea that if a LLM has knowledge of a given concept, sampled responses are likely to be similar and contain consistent facts. However, for hallucinated facts, stochastically sampled responses are likely to diverge and contradict one another. We investigate this approach by using GPT-3 to generate passages about individuals from the WikiBio dataset, and manually annotate the factuality of the generated passages. We demonstrate that SelfCheckGPT can: i) detect non-factual and factual sentences; and ii) rank passages in terms of factuality. We compare our approach to several existing baselines and show that in sentence hallucination detection, our approach has AUC-PR scores comparable to grey-box methods, while SelfCheckGPT is best at passage factuality assessment.

翻译：生成式大型语言模型（如GPT-3）能够针对各类用户提示生成高度流畅的回复。然而，已知此类模型会编造事实并做出不实陈述，这削弱了其输出的可信度。现有的事实核查方法要么需要访问模型输出的词元级概率分布（对于ChatGPT等系统可能无法获取），要么依赖通过独立且常为复杂的模块进行接口的外部数据库。本文提出"SelfCheckGPT"，一种基于采样的简洁方法，可在零资源条件下（即无需外部数据库）对黑箱模型进行事实核查。该方法基于一个简单思想：若大语言模型掌握某一概念，其采样生成的回复应具有相似性并包含一致的事实；而对于编造的事实，随机采样的回复则可能相互偏离并产生矛盾。我们通过使用GPT-3生成关于WikiBio数据集中人物的段落，并人工标注生成段落的事实性，对该方法进行验证。结果表明，SelfCheckGPT能够：i）检测不实与事实性句子；ii）按事实性对段落排序。相较于现有多个基线方法，在句子级幻觉检测任务中，我们的方法与灰箱方法获得相近的AUC-PR分数，而SelfCheckGPT在段落事实性评估中表现最优。

0

相关内容

语言模型化

语言模型化

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

专知会员服务

68+阅读 · 2022年3月20日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

专知会员服务

29+阅读 · 2019年11月8日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

【数据集】新的YELP数据集官方下载

【数据集】新的YELP数据集官方下载

机器学习研究会

16+阅读 · 2017年8月31日

内质网应激IRE1－XBP1S通路在高糖引起肾脏及系膜细胞发生氧化应激及损伤中的机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

p53基因突变促进Wilms 肿瘤发展转移的小鼠动物模型研究

国家自然科学基金

0+阅读 · 2014年12月31日

hTERT基因多态性影响动脉粥样硬化形成机制及民族差异性研究

国家自然科学基金

0+阅读 · 2014年12月31日

从RAAS激活通路探讨壮药壮通饮干预冠心病心肌缺血血瘀模型大鼠的分子靶点

国家自然科学基金

0+阅读 · 2014年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

石墨烯材料的太赫兹响应

国家自然科学基金

0+阅读 · 2012年12月31日

新疆维吾尔族精神分裂症新发生的拷贝数变异（de novo CNV）研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于TEM结构化数据库的空管人误风险量化研究

国家自然科学基金

0+阅读 · 2009年12月31日

微型化CE－AD/C4D系统在常见代谢病诊断中的方法研究

国家自然科学基金

0+阅读 · 2008年12月31日

华北克拉通橄榄岩包体的显微构造、含水量与地震波性质

国家自然科学基金

0+阅读 · 2008年12月31日

ChatUniTest: a ChatGPT-based automated unit test generation tool

Arxiv

1+阅读 · 2023年5月8日

Regulating ChatGPT and other Large Generative AI Models

Arxiv

0+阅读 · 2023年5月8日

Prompted LLMs as Chatbot Modules for Long Open-domain Conversation

Arxiv

0+阅读 · 2023年5月8日

No More Manual Tests? Evaluating and Improving ChatGPT for Unit Test Generation

Arxiv

0+阅读 · 2023年5月7日

On the Usage of Continual Learning for Out-of-Distribution Generalization in Pre-trained Language Models of Code

Arxiv

0+阅读 · 2023年5月6日

Self-Edit: Fault-Aware Code Editor for Code Generation

Arxiv

0+阅读 · 2023年5月6日

Data Station: Delegated, Trustworthy, and Auditable Computation to Enable Data-Sharing Consortia with a Data Escrow

Arxiv

0+阅读 · 2023年5月5日

Large Language Models for Code: Security Hardening and Adversarial Testing

Arxiv

0+阅读 · 2023年5月5日

An Adaptive Benchmark for Modeling User Exploration of Large Datasets

Arxiv

0+阅读 · 2023年5月5日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

VIP会员

文章信息

相关主题

语言模型化

最新内容

五角大楼新设无人机办公室（DRPM-UxS）将如何重塑美国无人系统格局（附美国防部设立备忘录）

五角大楼新设无人机办公室（DRPM-UxS）将如何重塑美国无人系统格局（附美国防部设立备忘录）

专知会员服务

0+阅读 · 今天8:28

印度精确打击与指挥架构的断层

印度精确打击与指挥架构的断层

专知会员服务

4+阅读 · 7月20日

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

专知会员服务

6+阅读 · 7月20日

美空军AI完成F-16战斗机自主空战历史性试飞

美空军AI完成F-16战斗机自主空战历史性试飞

专知会员服务

6+阅读 · 7月20日

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

专知会员服务

6+阅读 · 7月20日

《美国陆军：通过弹性分布式模型库实现自适应AI优势》

《美国陆军：通过弹性分布式模型库实现自适应AI优势》

专知会员服务

4+阅读 · 7月20日

博士论文 | 理解与改进大语言模型推理：从反转诅咒到连续思维链

博士论文 | 理解与改进大语言模型推理：从反转诅咒到连续思维链

专知会员服务

7+阅读 · 7月20日

综述 | 终身视觉表征：持续自监督学习CSSL系统综述

综述 | 终身视觉表征：持续自监督学习CSSL系统综述

专知会员服务

7+阅读 · 7月20日

深入Project Maven：为何人工智能在战场上依然失灵

深入Project Maven：为何人工智能在战场上依然失灵

专知会员服务

14+阅读 · 7月19日

锻造未来士兵：外骨骼、基因工程与赛博格

锻造未来士兵：外骨骼、基因工程与赛博格

专知会员服务

7+阅读 · 7月19日

《无人机系统（UAS）通信网状网络试验性部署》50页报告

《无人机系统（UAS）通信网状网络试验性部署》50页报告

专知会员服务

10+阅读 · 7月19日

《无人机蜂群通信技术研究》50页

《无人机蜂群通信技术研究》50页

专知会员服务

11+阅读 · 7月19日

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

专知会员服务

16+阅读 · 7月18日

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

专知会员服务

8+阅读 · 7月18日

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

专知会员服务

17+阅读 · 7月18日

相关VIP内容

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

专知会员服务

68+阅读 · 2022年3月20日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

微软发布DialoGPT预训练语言模型，论文与代码 Large-Scale Generative Pre-training for Conversational Response Generation

专知会员服务

29+阅读 · 2019年11月8日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

印度精确打击与指挥架构的断层

美空军AI完成F-16战斗机自主空战历史性试飞

五角大楼新设无人机办公室（DRPM-UxS）将如何重塑美国无人系统格局（附美国防部设立备忘录）

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

【数据集】新的YELP数据集官方下载

【数据集】新的YELP数据集官方下载

机器学习研究会

16+阅读 · 2017年8月31日

相关论文

ChatUniTest: a ChatGPT-based automated unit test generation tool

Arxiv

1+阅读 · 2023年5月8日

Regulating ChatGPT and other Large Generative AI Models

Arxiv

0+阅读 · 2023年5月8日

Prompted LLMs as Chatbot Modules for Long Open-domain Conversation

Arxiv

0+阅读 · 2023年5月8日

No More Manual Tests? Evaluating and Improving ChatGPT for Unit Test Generation

Arxiv

0+阅读 · 2023年5月7日

On the Usage of Continual Learning for Out-of-Distribution Generalization in Pre-trained Language Models of Code

Arxiv

0+阅读 · 2023年5月6日

Self-Edit: Fault-Aware Code Editor for Code Generation

Arxiv

0+阅读 · 2023年5月6日

Data Station: Delegated, Trustworthy, and Auditable Computation to Enable Data-Sharing Consortia with a Data Escrow

Arxiv

0+阅读 · 2023年5月5日

Large Language Models for Code: Security Hardening and Adversarial Testing

Arxiv

0+阅读 · 2023年5月5日

An Adaptive Benchmark for Modeling User Exploration of Large Datasets

Arxiv

0+阅读 · 2023年5月5日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

相关基金

内质网应激IRE1－XBP1S通路在高糖引起肾脏及系膜细胞发生氧化应激及损伤中的机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

p53基因突变促进Wilms 肿瘤发展转移的小鼠动物模型研究

国家自然科学基金

0+阅读 · 2014年12月31日

hTERT基因多态性影响动脉粥样硬化形成机制及民族差异性研究

国家自然科学基金

0+阅读 · 2014年12月31日

从RAAS激活通路探讨壮药壮通饮干预冠心病心肌缺血血瘀模型大鼠的分子靶点

国家自然科学基金

0+阅读 · 2014年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

石墨烯材料的太赫兹响应

国家自然科学基金

0+阅读 · 2012年12月31日

新疆维吾尔族精神分裂症新发生的拷贝数变异（de novo CNV）研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于TEM结构化数据库的空管人误风险量化研究

国家自然科学基金

0+阅读 · 2009年12月31日

微型化CE－AD/C4D系统在常见代谢病诊断中的方法研究

国家自然科学基金

0+阅读 · 2008年12月31日

华北克拉通橄榄岩包体的显微构造、含水量与地震波性质

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员