Evaluating Machine Translation Quality with Conformal Predictive Distributions - 专知论文

会员服务 ·

0

Conformer · Machine Translation · 得分 · 可交换的 · 覆盖 ·

2023 年 6 月 2 日

Evaluating Machine Translation Quality with Conformal Predictive Distributions

翻译：基于共形预测分布的机器翻译质量评估

Patrizio Giovannotti

from arxiv, Accepted at the 12th Symposium on Conformal and Probabilistic Prediction with Applications, COPA 2023

This paper presents a new approach for assessing uncertainty in machine translation by simultaneously evaluating translation quality and providing a reliable confidence score. Our approach utilizes conformal predictive distributions to produce prediction intervals with guaranteed coverage, meaning that for any given significance level $\epsilon$, we can expect the true quality score of a translation to fall out of the interval at a rate of $1-\epsilon$. In this paper, we demonstrate how our method outperforms a simple, but effective baseline on six different language pairs in terms of coverage and sharpness. Furthermore, we validate that our approach requires the data exchangeability assumption to hold for optimal performance.

翻译：本文提出一种评估机器翻译不确定性的新方法，该方法通过同步评估翻译质量并提供可靠的置信度分数。我们的方法利用共形预测分布生成具有保证覆盖率的预测区间，这意味着对于任意给定的显著性水平$\epsilon$，翻译的真实质量分数以$1-\epsilon$的概率落在区间之外。在本文中，我们展示了该方法在覆盖率和锐度两方面优于六种不同语言对上的简单有效基线。此外，我们验证了该方法需要满足数据可交换性假设才能获得最优性能。

1

相关内容

Conformer

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【开放书】卡耐基梅隆大学Elaine Shi 教授《Foundations of Distributed Consensus and Blockchains（分布式共识和区块链的基础）》150页pdf

【开放书】卡耐基梅隆大学Elaine Shi 教授《Foundations of Distributed Consensus and Blockchains（分布式共识和区块链的基础）》150页pdf

专知会员服务

30+阅读 · 2022年2月22日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

116+阅读 · 2020年4月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

247+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Decorin对急性缺血性卒中后血脑屏障中ZO-1蛋白的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于概率校准和集成学习的出生缺陷发病风险预测模型研究

国家自然科学基金

0+阅读 · 2015年12月31日

极化增强的AlGaN日盲雪崩光电探测器研究

国家自然科学基金

0+阅读 · 2014年12月31日

二亚硝基哌嗪（DNP）介导Clusterin表达参与鼻咽癌转移的分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

表面等离激元增强AlGaN基深紫外LED研究

国家自然科学基金

0+阅读 · 2012年12月31日

SiC/Ti基复合材料C/Mo双涂层界面改性机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

microRNA-29b介导血管平滑肌细胞AT1aR基因DNA去甲基化参与高血压发病机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Tm,Ho晶体的种子光注入锁定可连续调谐2μ#21333;频激光器

国家自然科学基金

0+阅读 · 2008年12月31日

Evaluating the Ripple Effects of Knowledge Editing in Language Models

Evaluating the Ripple Effects of Knowledge Editing in Language Models

Arxiv

0+阅读 · 2023年7月24日

Causal Fair Machine Learning via Rank-Preserving Interventional Distributions

Arxiv

0+阅读 · 2023年7月24日

FSLens: A Visual Analytics Approach to Evaluating and Optimizing the Spatial Layout of Fire Stations

Arxiv

0+阅读 · 2023年7月23日

Safety-Assured Speculative Planning with Adaptive Prediction

Arxiv

0+阅读 · 2023年7月21日

Towards Non-Parametric Models for Confidence Aware Image Prediction from Low Data using Gaussian Processes

Arxiv

0+阅读 · 2023年7月20日

Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

Arxiv

0+阅读 · 2023年7月20日

Flexible machine learning estimation of conditional average treatment effects: a blessing and a curse

Arxiv

0+阅读 · 2023年7月20日

Prompt Distribution Learning

Arxiv

14+阅读 · 2022年5月6日

An Overview on Machine Translation Evaluation

An Overview on Machine Translation Evaluation

Arxiv

14+阅读 · 2022年2月22日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

VIP会员

文章信息

相关主题

Machine Translation

最新内容

综述 | Memory for Large Language Models：大模型记忆机制全景

综述 | Memory for Large Language Models：大模型记忆机制全景

专知会员服务

0+阅读 · 7月29日

博士论文 | Riemannian Deep Learning：模块、网络与几何

博士论文 | Riemannian Deep Learning：模块、网络与几何

专知会员服务

0+阅读 · 7月29日

《越野作战环境下路径规划的多准则整数规划模型》

《越野作战环境下路径规划的多准则整数规划模型》

专知会员服务

5+阅读 · 7月29日

人工智能大语言模型引擎如何重塑全球冲突信息环境最新50页

人工智能大语言模型引擎如何重塑全球冲突信息环境最新50页

专知会员服务

3+阅读 · 7月29日

《防空系统对自主武器系统辩论中“有意义的人类控制”的启示》70页报告

《防空系统对自主武器系统辩论中“有意义的人类控制”的启示》70页报告

专知会员服务

3+阅读 · 7月29日

“对标ChatGPT”：乌军研发Marichka AI系统用于战场筹划

“对标ChatGPT”：乌军研发Marichka AI系统用于战场筹划

专知会员服务

7+阅读 · 7月29日

《同步多无人机系统中的故障与通信》

《同步多无人机系统中的故障与通信》

专知会员服务

2+阅读 · 7月29日

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

专知会员服务

4+阅读 · 7月28日

博士论文 | 从算法到基础模型：强化学习的统一视角

博士论文 | 从算法到基础模型：强化学习的统一视角

专知会员服务

9+阅读 · 7月28日

面向国防作战的最佳自主与蜂群无人机技术

面向国防作战的最佳自主与蜂群无人机技术

专知会员服务

7+阅读 · 7月28日

《异构人类团队的协作决策过程混合建模研究》

《异构人类团队的协作决策过程混合建模研究》

专知会员服务

8+阅读 · 7月28日

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

专知会员服务

8+阅读 · 7月28日

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

专知会员服务

9+阅读 · 7月28日

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

6+阅读 · 7月27日

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

11+阅读 · 7月27日

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【开放书】卡耐基梅隆大学Elaine Shi 教授《Foundations of Distributed Consensus and Blockchains（分布式共识和区块链的基础）》150页pdf

【开放书】卡耐基梅隆大学Elaine Shi 教授《Foundations of Distributed Consensus and Blockchains（分布式共识和区块链的基础）》150页pdf

专知会员服务

30+阅读 · 2022年2月22日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

116+阅读 · 2020年4月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

247+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

博士论文 | Riemannian Deep Learning：模块、网络与几何

人工智能大语言模型引擎如何重塑全球冲突信息环境最新50页

综述 | Memory for Large Language Models：大模型记忆机制全景

《越野作战环境下路径规划的多准则整数规划模型》

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Evaluating the Ripple Effects of Knowledge Editing in Language Models

Evaluating the Ripple Effects of Knowledge Editing in Language Models

Arxiv

0+阅读 · 2023年7月24日

Causal Fair Machine Learning via Rank-Preserving Interventional Distributions

Arxiv

0+阅读 · 2023年7月24日

FSLens: A Visual Analytics Approach to Evaluating and Optimizing the Spatial Layout of Fire Stations

Arxiv

0+阅读 · 2023年7月23日

Safety-Assured Speculative Planning with Adaptive Prediction

Arxiv

0+阅读 · 2023年7月21日

Towards Non-Parametric Models for Confidence Aware Image Prediction from Low Data using Gaussian Processes

Arxiv

0+阅读 · 2023年7月20日

Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

Arxiv

0+阅读 · 2023年7月20日

Flexible machine learning estimation of conditional average treatment effects: a blessing and a curse

Arxiv

0+阅读 · 2023年7月20日

Prompt Distribution Learning

Arxiv

14+阅读 · 2022年5月6日

An Overview on Machine Translation Evaluation

An Overview on Machine Translation Evaluation

Arxiv

14+阅读 · 2022年2月22日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

相关基金

Decorin对急性缺血性卒中后血脑屏障中ZO-1蛋白的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于概率校准和集成学习的出生缺陷发病风险预测模型研究

国家自然科学基金

0+阅读 · 2015年12月31日

极化增强的AlGaN日盲雪崩光电探测器研究

国家自然科学基金

0+阅读 · 2014年12月31日

二亚硝基哌嗪（DNP）介导Clusterin表达参与鼻咽癌转移的分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

表面等离激元增强AlGaN基深紫外LED研究

国家自然科学基金

0+阅读 · 2012年12月31日

SiC/Ti基复合材料C/Mo双涂层界面改性机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

microRNA-29b介导血管平滑肌细胞AT1aR基因DNA去甲基化参与高血压发病机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Tm,Ho晶体的种子光注入锁定可连续调谐2μ#21333;频激光器

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员