Language models (LMs), including large language models such as ChatGPT, have the potential to assist clinicians in generating various clinical notes. However, LMs are prone to producing ``hallucinations'', i.e., generated content that is not aligned with facts and knowledge. In this paper, we propose Re$^3$Writer, a method that combines retrieval-augmented generation with knowledge-grounded reasoning to enable LMs to generate faithful clinical texts. We demonstrate the effectiveness of our method on the task of generating patient discharge instructions. This task requires LMs not only to understand a patient's long clinical documents, i.e., the health records during hospitalization, but also to generate the critical instructional information provided to both carers and the patient at the time of discharge. Re$^3$Writer imitates the working patterns of physicians: it first \textbf{re}trieves related working experience from historical instructions written by physicians, then \textbf{re}asons over related medical knowledge, and finally \textbf{re}fines the retrieved working experience and reasoned medical knowledge to extract useful information, which is used to generate discharge instructions for previously unseen patients. Our experiments show that our method substantially boosts the performance of five representative LMs across all metrics. We also report human evaluations that measure its effectiveness in terms of fluency, faithfulness, and comprehensiveness.
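The retrieve, reason, and refine steps described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the bag-of-words cosine retrieval, the dictionary standing in for a medical knowledge base, the token-overlap filter used as "refinement", and all function names (`retrieve`, `reason`, `refine`, `build_prompt`) are assumptions made purely for illustration. A real system would presumably use learned encoders and a proper knowledge source.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (a stand-in for a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(record, history, k=1):
    """Retrieve: return the instructions of the k most similar past records.

    `history` is a list of (past_record, past_instruction) pairs.
    """
    query = bow(record)
    ranked = sorted(
        history, key=lambda pair: cosine(query, bow(pair[0])), reverse=True
    )
    return [instruction for _, instruction in ranked[:k]]


def reason(record, knowledge):
    """Reason: collect facts whose key term appears in the record.

    `knowledge` is a hypothetical dict mapping a term to a list of facts.
    """
    tokens = set(record.lower().split())
    return [
        fact
        for term, facts in knowledge.items()
        if term in tokens
        for fact in facts
    ]


def refine(record, candidates, min_overlap=1):
    """Refine: keep candidates sharing at least `min_overlap` tokens with the record."""
    tokens = set(record.lower().split())
    return [
        c for c in candidates
        if len(tokens & set(c.lower().split())) >= min_overlap
    ]


def build_prompt(record, history, knowledge):
    """Assemble the refined context into a prompt for a downstream LM."""
    context = refine(record, retrieve(record, history) + reason(record, knowledge))
    lines = "\n".join("- " + c for c in context)
    return "Record: " + record + "\nContext:\n" + lines
```

The string returned by `build_prompt` would then be passed to whichever LM generates the discharge instruction. Keeping the three steps as separate functions mirrors the order given in the abstract and lets each one be replaced independently.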