Microsatellite instability-high (MSI-H) is a tumor-agnostic biomarker for immune checkpoint inhibitor therapy. However, MSI status is not routinely tested in prostate cancer, in part because of low prevalence and assay cost. Predicting MSI status from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) could therefore identify the prostate cancer patients most likely to benefit from confirmatory testing and to become eligible for immunotherapy. We analyzed prostate biopsies and surgical resections from de-identified records of consecutive prostate cancer patients referred to our institution, with MSI status determined by next-generation sequencing. Patients seen before a cutoff date were split into an algorithm development set (n=4015, MSI-H 1.8%) and a paired validation set (n=173, MSI-H 19.7%) that consisted of two serial sections from each sample, one stained and scanned internally and the other at an external site. Patients seen after the cutoff date formed the temporal validation set (n=1350, MSI-H 2.3%). Attention-based multiple instance learning models were trained to predict MSI-H from H&E WSIs. The MSI-H predictor achieved area under the receiver operating characteristic curve values of 0.78 (95% CI [0.69-0.86]), 0.72 (95% CI [0.63-0.81]), and 0.72 (95% CI [0.62-0.82]) on the internally prepared, externally prepared, and temporal validation sets, respectively. Although MSI-H status was significantly correlated with Gleason score, the model remained predictive within each Gleason score subgroup. In summary, we developed and validated an AI-based MSI-H diagnostic model on a large real-world cohort of routine H&E slides; the model generalized to externally stained and scanned samples and to a temporally independent validation cohort. This algorithm has the potential to direct prostate cancer patients toward immunotherapy and to identify MSI-H cases secondary to Lynch syndrome.
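To make "attention-based multiple instance learning" concrete, the sketch below shows one common form of attention pooling over a bag of tile embeddings. It is an illustration only and is not the model described in the abstract: the function names (`attention_pool`, `predict_msi_h`), the parameters (`V`, `w`, `clf_weights`, `clf_bias`), and the toy numbers are all assumptions, and a real pipeline would learn these parameters and extract tile features with a neural network.

```python
import math


def attention_pool(tiles, V, w):
    """Pool one bag of tile embeddings into a single slide embedding.

    tiles: list of tile feature vectors (each a list of floats, length d)
    V:     hypothetical projection matrix, given as a list of rows of length d
    w:     hypothetical scoring vector, one entry per row of V
    Returns (slide_embedding, attention_weights).
    """
    # Unnormalized attention score per tile: w . tanh(V h)
    scores = []
    for h in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))

    # Softmax over the tiles of this slide (shifted by the max for stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Slide embedding is the attention-weighted sum of tile embeddings
    dim = len(tiles[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, tiles)) for j in range(dim)]
    return pooled, weights


def predict_msi_h(tiles, V, w, clf_weights, clf_bias):
    """Slide-level MSI-H probability from a logistic layer on the pooled embedding."""
    pooled, weights = attention_pool(tiles, V, w)
    logit = sum(c * p for c, p in zip(clf_weights, pooled)) + clf_bias
    return 1.0 / (1.0 + math.exp(-logit)), weights


# Toy example: three tiles with 2-dimensional embeddings (made-up values)
toy_tiles = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
toy_V = [[1.0, 0.0], [0.0, 1.0]]
toy_w = [1.0, 1.0]
prob, attn = predict_msi_h(toy_tiles, toy_V, toy_w, [2.0, 2.0], -2.0)
print(prob, attn)
```

The attention weights sum to one over the tiles of a slide, so they can be read as the relative contribution of each tile to the slide-level prediction. Only a slide-level label is needed for training, which fits a setting where MSI status comes from sequencing rather than from tile-level annotation.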