Generative AI (GenAI) models have demonstrated remarkable capabilities in a wide variety of medical tasks. However, because these models are trained on generalist datasets with very limited human oversight, they can learn uses of medical products that have been neither adequately evaluated for safety and efficacy nor approved by regulatory agencies. Given the scale at which GenAI may reach users, unvetted recommendations pose a public health risk. In this work, we propose an approach to identify potentially harmful product recommendations and demonstrate it using a recent multimodal large language model.