What Comes Next? Evaluating Uncertainty in Neural Text Generators Against Human Production Variability

19 May 2023

Mario Giulianelli, Joris Baan, Wilker Aziz, Raquel Fernández, Barbara Plank

In Natural Language Generation (NLG) tasks, for any input, multiple communicative goals are plausible, and any goal can be put into words, or produced, in multiple ways. We characterise the extent to which human production varies lexically, syntactically, and semantically across four NLG tasks, connecting human production variability to aleatoric or data uncertainty. We then inspect the space of output strings shaped by a generation system's predicted probability distribution and decoding algorithm to probe its uncertainty. For each test input, we measure the generator's calibration to human production variability. Following this instance-level approach, we analyse NLG models and decoding strategies, demonstrating that probing a generator with multiple samples and, when possible, multiple references, provides the level of detail necessary to gain understanding of a model's representation of uncertainty.
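The instance-level idea described in the abstract, comparing the spread of a generator's samples with the spread of human references for the same input, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the token-set Jaccard distance as the lexical measure, the difference of mean pairwise distances as the comparison, and the function names `jaccard_distance`, `pairwise_distances`, and `variability_gap`.

```python
from itertools import combinations
from statistics import mean


def jaccard_distance(a: str, b: str) -> float:
    """Lexical distance (assumed measure): 1 - |A ∩ B| / |A ∪ B| over lowercased tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0  # two empty strings are treated as identical
    return 1.0 - len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pairwise_distances(texts: list[str]) -> list[float]:
    """Distances between every unordered pair of texts (needs at least two texts)."""
    return [jaccard_distance(x, y) for x, y in combinations(texts, 2)]


def variability_gap(references: list[str], samples: list[str]) -> float:
    """Mean pairwise distance among model samples minus that among human references.

    Hypothetical per-input score: negative values suggest the generator varies
    less than humans do for this input, positive values suggest it varies more.
    """
    return mean(pairwise_distances(samples)) - mean(pairwise_distances(references))


if __name__ == "__main__":
    # Toy data for a single test input: several human references, several model samples.
    human_refs = ["the cat sat", "a cat sat down", "the dog ran"]
    model_samples = ["the cat sat", "the cat sat", "the cat sat"]
    print(variability_gap(human_refs, model_samples))
```

In this toy case the samples are identical, so their mean pairwise distance is zero and the gap is negative. A single sample or a single reference could not reveal this, which is the motivation the abstract gives for probing with multiple samples and multiple references. A fuller analysis would compare the whole distance distributions and would add syntactic and semantic distance measures alongside the lexical one.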

