In recent years, large language models (LLMs) have garnered significant attention for their strong performance on complex reasoning tasks. However, recent studies show that their reasoning capabilities may diminish markedly when problem descriptions contain irrelevant information, even with the use of advanced prompting techniques. To investigate this issue further, a dataset of primary school mathematics problems containing irrelevant information, named GSMIR, was constructed. Testing prominent LLMs and prompting techniques on this dataset revealed that while LLMs can identify irrelevant information, they do not effectively mitigate the interference it causes once it is identified. To address this shortcoming, a novel automatic construction method, ATF, is proposed that enhances the ability of LLMs to identify and self-mitigate the influence of irrelevant information. The method operates in two steps: first, analysis of the irrelevant information, followed by its filtering. Experimental results on the GSMIR dataset demonstrate that ATF significantly improves the reasoning performance of LLMs across prompting techniques, even when irrelevant information is present.
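The two-step idea described above (analyze, then filter) can be sketched as a simple prompting pipeline. This is a minimal illustration, not the paper's implementation: `llm` is a hypothetical callable mapping a prompt string to a response string, and the prompt wording and sentence-splitting heuristic are assumptions.

```python
import re

def atf_filter(problem: str, llm) -> str:
    """Two-step sketch: (1) ask the model to flag irrelevant sentences,
    (2) remove the flagged sentences before the problem is solved."""
    # Step 1: analysis — ask the model which sentences are irrelevant.
    analysis_prompt = (
        "Identify any sentences in the problem below that are irrelevant "
        "to solving it. List them verbatim, one per line.\n\n" + problem
    )
    irrelevant = [s.strip() for s in llm(analysis_prompt).splitlines() if s.strip()]

    # Step 2: filtration — drop any sentence containing a flagged span.
    sentences = [s.strip() for s in re.findall(r"[^.?!]+[.?!]?", problem)]
    kept = [s for s in sentences if not any(bad in s for bad in irrelevant)]
    return " ".join(kept)

def solve_with_atf(problem: str, llm) -> str:
    # The cleaned problem is then passed to an ordinary reasoning prompt.
    return llm("Solve step by step:\n" + atf_filter(problem, llm))
```

In this sketch the same model performs both steps, so any prompting technique (e.g. chain-of-thought) can be applied unchanged to the filtered problem in the final call.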