Despite great performance on many tasks, language models (LMs) still struggle with reasoning, sometimes producing responses that cannot possibly be true because they stem from logical incoherence. We call such responses \textit{strong hallucinations} and prove that they follow from the way an LM computes its internal representations of logical operators and the outputs it derives from those representations. Focusing on negation, we provide a novel solution in which negation is treated not as another element of a latent representation, but as \textit{an operation over an LM's latent representations that constrains how they may evolve}. We show that our approach improves model performance on cloze prompting and natural language inference tasks involving negation, without requiring training on sparse negative data.
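To make the idea of negation as an operation over representations concrete, the following minimal sketch (in Python, using the Hugging Face \texttt{transformers} library) scores a negated cloze prompt by querying the model with the corresponding positive prompt and then applying a toy negation operator to the resulting output distribution. The complement-renormalization rule and the choice of \texttt{bert-base-uncased} are illustrative assumptions, not the construction developed in the paper.

\begin{verbatim}
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative only: the model and the complement rule below are
# assumptions for this sketch, not the paper's actual operator.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
mlm.eval()

def cloze_distribution(text):
    """Masked-token distribution for a prompt containing [MASK]."""
    inputs = tok(text, return_tensors="pt")
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, mask_pos]
    return logits.softmax(-1)

def negate(p_pos):
    """Toy negation operator: shift probability mass to the
    complement of the positive prediction, then renormalize."""
    p_neg = 1.0 - p_pos
    return p_neg / p_neg.sum()

# "A robin is not a ___": score candidates via the positive
# prompt, then apply the operator, rather than feeding the
# surface token "not" to the model as ordinary input.
p_pos = cloze_distribution(f"A robin is a {tok.mask_token}.")
p_neg = negate(p_pos)
print([tok.decode(int(i)) for i in p_neg.topk(5).indices])
\end{verbatim}

The point of the sketch is only the factoring it exhibits: the surface negation never enters the encoder; it acts afterwards, as a constraint on the distribution that the positive representation yields.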