Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models

Despite increasingly fluent, relevant, and coherent language generation, major gaps remain between how humans and machines use language. We argue that a key dimension that is missing from our understanding of language models (LMs) is the model's ability to interpret and generate expressions of uncertainty. Whether it be the weatherperson announcing a chance of rain or a doctor giving a diagnosis, information is often not black-and-white and expressions of uncertainty provide nuance to support human-decision making. The increasing deployment of LMs in the wild motivates us to investigate whether LMs are capable of interpreting expressions of uncertainty and how LMs' behaviors change when learning to emit their own expressions of uncertainty. When injecting expressions of uncertainty into prompts (e.g., "I think the answer is..."), we discover that GPT3's generations vary upwards of 80% in accuracy based on the expression used. We analyze the linguistic characteristics of these expressions and find a drop in accuracy when naturalistic expressions of certainty are present. We find similar effects when teaching models to emit their own expressions of uncertainty, where model calibration suffers when teaching models to emit certainty rather than uncertainty. Together, these results highlight the challenges of building LMs that interpret and generate trustworthy expressions of uncertainty.

翻译：尽管语言生成越来越流畅、相关且连贯，但人类与机器使用语言的方式之间仍存在重大差距。我们认为，当前对语言模型（LMs）理解中缺失的一个关键维度，是模型解释和生成不确定性表达的能力。无论是天气预报员预告降雨概率，还是医生给出诊断，信息往往并非非黑即白，而不确定性表达则为支持人类决策提供了细微差别。随着语言模型在现实场景中的日益部署，我们有必要探究语言模型是否能够解释不确定性表达，以及当模型学会发出自身的不确定性表达时，其行为会如何变化。当在提示中注入不确定性表达（例如“我认为答案是……”）时，我们发现GPT3的生成准确率会因所使用的表达不同而出现高达80%的波动。我们分析了这些表达的语言特征，并发现当存在自然形态的确定性表达时，准确率会下降。我们还在教导模型生成自身的不确定性表达时发现了类似效应：当教导模型生成确定性而非不确定性时，模型的校准性会变差。这些结果共同凸显了构建能够解释并生成可信赖不确定性表达的语言模型所面临的挑战。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

专知会员服务

138+阅读 · 2022年2月6日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日