ChatGPT and post-test probability

Reinforcement learning-based large language models, such as ChatGPT, are believed to have the potential to aid human experts in many domains, including healthcare. There is, however, little work on ChatGPT's ability to perform a key task in healthcare: formal, probabilistic medical diagnostic reasoning. This type of reasoning is used, for example, to update a pre-test probability to a post-test probability. In this work, we probe ChatGPT's ability to perform this task. In particular, we ask ChatGPT to give examples of how to use Bayes' rule for medical diagnosis. Our prompts range from queries that use terminology from pure probability (e.g., requests for a "posterior probability") to queries that use terminology from the medical diagnosis literature (e.g., requests for a "post-test probability"). We show that introducing medical variable names increases the number of errors that ChatGPT makes. Given our results, we also show how prompt engineering can help ChatGPT partially avoid these errors. We discuss our results in light of recent commentaries on sensitivity and specificity. We also discuss how our results might inform new research directions for large language models.
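To make the task concrete, the update from a pre-test probability to a post-test probability can be sketched with Bayes' rule as below. This is a minimal illustration, not the prompts or examples used in the paper: the function name `post_test_probability` and the numbers (1% pre-test probability, 90% sensitivity, 95% specificity) are assumptions chosen only for demonstration.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive=True):
    """Bayes' rule update for a binary diagnostic test (illustrative sketch).

    pre_test    : P(disease) before the test
    sensitivity : P(test positive | disease)
    specificity : P(test negative | no disease)
    positive    : True for a positive result, False for a negative one
    """
    for name, value in (("pre_test", pre_test),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    if positive:
        # Likelihood of a positive result with and without the disease.
        like_disease = sensitivity
        like_healthy = 1.0 - specificity
    else:
        # Likelihood of a negative result with and without the disease.
        like_disease = 1.0 - sensitivity
        like_healthy = specificity

    numerator = like_disease * pre_test
    evidence = numerator + like_healthy * (1.0 - pre_test)
    if evidence == 0.0:
        raise ValueError("this test result has zero probability under the inputs")
    return numerator / evidence


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    print(post_test_probability(0.01, 0.90, 0.95, positive=True))
    print(post_test_probability(0.01, 0.90, 0.95, positive=False))
```

With these assumed inputs, a positive result raises the probability of disease from 1% to roughly 15%, and a negative result lowers it to roughly 0.1%. Mixing up sensitivity with specificity, or P(positive | disease) with P(disease | positive), changes these values substantially, which is why the choice of variable names matters in this kind of calculation.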
