Recent advances in generative vision-language models (VLMs) have exciting potential implications for AI in radiology, yet VLMs are also known to produce hallucinations, nonsensical text, and other unwanted behaviors that can waste clinicians' time and cause patient harm. Drawing on recent work on direct preference optimization (DPO), we propose a simple method for modifying the behavior of pretrained VLMs performing radiology report generation by suppressing unwanted types of generations. We apply our method to the prevention of hallucinations of prior exams, addressing a long-established problem behavior in models performing chest X-ray report generation. Across our experiments, we find that DPO fine-tuning achieves a 3.2-4.8x reduction in lines hallucinating prior exams while maintaining model performance on clinical accuracy metrics. To the best of our knowledge, this is the first work to apply DPO to medical VLMs, providing a data- and compute-efficient way to suppress problem behaviors while maintaining overall clinical accuracy.
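For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-example DPO objective, not necessarily the exact variant used in this work. It assumes that each training pair consists of a preferred ("chosen") report and a dispreferred ("rejected") report for the same image, for example a report without and with a hallucinated reference to a prior exam. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    Each argument is the summed log-probability of a full report under
    either the policy being fine-tuned or the frozen reference model.
    `beta` (assumed default 0.1) scales how strongly the policy is
    pushed away from the reference.
    """
    # How much more (or less) likely each report is under the policy
    # than under the reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the
    # chosen report more than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) computed as a numerically stable softplus.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities, for illustration only.
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)   # policy equals reference
improved = dpo_loss(-4.0, -9.0, -5.0, -5.0)   # policy prefers chosen report
```

When the policy and reference agree, the loss equals log 2; it decreases as the policy assigns relatively more probability to the chosen report and less to the rejected one.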