The technical landscape of clinical machine learning is shifting in ways that destabilize pervasive assumptions about the nature and causes of algorithmic bias. On one hand, the dominant paradigm in clinical machine learning is narrow in the sense that models are trained on biomedical datasets for particular clinical tasks such as diagnosis and treatment recommendation. On the other hand, the emerging paradigm is generalist in the sense that general-purpose language models such as Google's BERT and PaLM are increasingly being adapted for clinical use cases via prompting or fine-tuning on biomedical datasets. Many of these next-generation models provide substantial performance gains over prior clinical models, but at the same time introduce novel kinds of algorithmic bias and complicate the explanatory relationship between algorithmic biases and biases in training data. This paper articulates how and in what respects biases in generalist models differ from biases in prior clinical models, and draws out practical recommendations for algorithmic bias mitigation.