Despite increasingly fluent, relevant, and coherent language generation, major gaps remain between how humans and machines use language. We argue that a key dimension missing from our understanding of language models (LMs) is their ability to interpret and generate expressions of uncertainty. Whether it is a weatherperson announcing a chance of rain or a doctor giving a diagnosis, information is often not black-and-white, and expressions of uncertainty provide nuance that supports human decision-making. The increasing deployment of LMs in the wild motivates us to investigate whether LMs can interpret expressions of uncertainty and how their behavior changes when they learn to emit their own. When we inject expressions of uncertainty into prompts (e.g., "I think the answer is..."), we discover that the accuracy of GPT3's generations varies by upwards of 80% depending on the expression used. We analyze the linguistic characteristics of these expressions and find a drop in accuracy when naturalistic expressions of certainty are present. We find similar effects when teaching models to emit their own expressions of uncertainty: model calibration suffers when models are taught to emit certainty rather than uncertainty. Together, these results highlight the challenges of building LMs that interpret and generate trustworthy expressions of uncertainty.