We explore extending chain-of-thought (CoT) prompting to medical reasoning for the task of automatic diagnosis. Motivated by doctors' underlying reasoning process, we present Diagnostic-Reasoning CoT (DR-CoT). Empirical results demonstrate that simply prompting large language models trained only on general text corpora with two DR-CoT exemplars improves diagnostic accuracy by 15% compared with standard prompting. Moreover, the gap reaches a pronounced 18% in out-of-domain settings. Our findings suggest that expert-knowledge reasoning in large language models can be elicited through proper prompting.
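To make the contrast between standard prompting and exemplar-based CoT prompting concrete, the following is a minimal sketch of how a two-exemplar prompt might be assembled. It is not the authors' implementation: the `Exemplar` class, the `build_prompt` function, and the `Patient:` / `Reasoning:` / `Diagnosis:` field labels are all hypothetical names chosen for illustration, and the exemplar text is left as unfilled placeholders because the abstract does not give the actual exemplars.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One worked example: patient description, reasoning chain, final diagnosis."""
    patient: str
    reasoning: str
    diagnosis: str


def build_prompt(exemplars, query, with_reasoning=True):
    """Assemble a few-shot prompt (hypothetical layout, not the paper's format).

    with_reasoning=True includes each exemplar's reasoning chain (CoT style);
    with_reasoning=False keeps only patient/diagnosis pairs (standard prompting).
    """
    blocks = []
    for ex in exemplars:
        lines = [f"Patient: {ex.patient}"]
        if with_reasoning:
            lines.append(f"Reasoning: {ex.reasoning}")
        lines.append(f"Diagnosis: {ex.diagnosis}")
        blocks.append("\n".join(lines))
    # The query ends with an open slot for the model to continue from.
    first_slot = "Reasoning:" if with_reasoning else "Diagnosis:"
    blocks.append(f"Patient: {query}\n{first_slot}")
    return "\n\n".join(blocks)


# Placeholder exemplars; the text in angle brackets is deliberately left unfilled.
EXEMPLARS = [
    Exemplar("<patient description 1>", "<reasoning chain 1>", "<diagnosis 1>"),
    Exemplar("<patient description 2>", "<reasoning chain 2>", "<diagnosis 2>"),
]

cot_prompt = build_prompt(EXEMPLARS, "<new patient description>")
standard_prompt = build_prompt(
    EXEMPLARS, "<new patient description>", with_reasoning=False
)
```

The only difference between the two prompts in this sketch is whether the reasoning chain appears between the patient description and the diagnosis, which isolates the effect of the reasoning exemplars when comparing the two settings.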