Transformer-based clinical language models are increasingly integrated into high-stakes clinical decision support pipelines, yet the computational mechanisms through which demographic associations encoded in medical documentation propagate into model probability distributions remain empirically underspecified. We present a systematic computational audit of representational bias in ClinicalBERT (Alsentzer et al., 2019), a BERT-based model pretrained on MIMIC-III discharge summaries, employing two complementary probing methodologies: Log Probability Bias Analysis (LPBA), which quantifies demographic descriptor-induced shifts in masked token probability distributions across behavioral and evaluative semantic categories, and Masked Language Model-based analysis (MLM), which probes internal representational structure for demographic agency attribution encoding across 98 real clinical sentence templates and eight intersectional race-gender combinations. Corpus frequency analysis operationalizes the distinction between statistical disparity and bias amplification by benchmarking model outputs against empirical term frequencies in the MIMIC-III training corpus. Of 32 statistically significant findings, 65.6% contradict observed corpus distributions, rising to 80% for Black patients and 87.5% for agency attribution under MLM probing, providing direct empirical evidence that representational bias in ClinicalBERT operates predominantly through model-internal amplification rather than training data inheritance.

Keywords: natural language processing, clinical documentation, algorithmic auditing, representational bias, health equity
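The corpus benchmarking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sign-agreement rule, and all numeric inputs below are assumptions chosen for illustration. The sketch treats a finding as contradicting the corpus when the model's log-probability shift for a term under a demographic descriptor points in the opposite direction from the log-ratio of that term's corpus frequencies. With 32 findings, a 65.6% contradiction rate corresponds to 21 of 32.

```python
import math


def log_ratio(p_group: float, p_reference: float) -> float:
    """Log of the ratio between a group-conditioned value and a reference value.

    Used here both for model probabilities (masked-token probability with vs.
    without a demographic descriptor) and for corpus relative frequencies.
    The exact scoring used in the paper is not specified; this is an assumption.
    """
    if p_group <= 0 or p_reference <= 0:
        raise ValueError("probabilities and frequencies must be positive")
    return math.log(p_group / p_reference)


def compare_to_corpus(model_shift: float, corpus_shift: float) -> str:
    """Label a finding by whether the model's shift agrees in sign with the corpus.

    Hypothetical rule: opposite signs mean the model output contradicts the
    observed corpus distribution; a zero shift on either side is undetermined.
    """
    if model_shift == 0 or corpus_shift == 0:
        return "undetermined"
    same_direction = (model_shift > 0) == (corpus_shift > 0)
    return "consistent_with_corpus" if same_direction else "contradicts_corpus"


def contradiction_rate(labels: list[str]) -> float:
    """Fraction of determined findings labeled as contradicting the corpus."""
    decided = [label for label in labels if label != "undetermined"]
    if not decided:
        raise ValueError("no determined findings to summarize")
    return sum(label == "contradicts_corpus" for label in decided) / len(decided)


# Illustrative (made-up) numbers: the model doubles the probability of a term
# for a group, while the corpus shows that term at half the reference frequency.
model_shift = log_ratio(0.02, 0.01)
corpus_shift = log_ratio(0.001, 0.002)
example_label = compare_to_corpus(model_shift, corpus_shift)

# Arithmetic check on the headline figure: 21 contradicting findings out of 32.
headline_labels = ["contradicts_corpus"] * 21 + ["consistent_with_corpus"] * 11
headline_rate = contradiction_rate(headline_labels)
```

A sign-agreement rule is the simplest way to separate inherited disparity from contradiction of the corpus. A fuller audit would presumably also compare magnitudes and apply the significance testing that the abstract mentions.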