Despite advances in machine learning and digital pathology, it is not yet clear whether machine learning methods can accurately predict molecular information from histomorphology alone. To answer this question, we built a large-scale dataset (185,538 images) with reliable measurements of Ki67, ER, PR, and HER2 status. The dataset is composed of mirrored images of H&E and the corresponding immunohistochemistry (IHC) assays (Ki67, ER, PR, and HER2). These images are mirrored through registration. To increase reliability, individual pairs were inspected and discarded if artifacts were present (tissue folding, bubbles, etc.). Measurements for Ki67, ER, and PR were determined by calculating the H-score from image analysis. The HER2 measurement is based on binary classification: 0 and 1+ (IHC scores representing the negative subset) vs. 3+ (the IHC score representing the positive subset). Cases with an equivocal IHC score (2+) were excluded. We show that a standard ViT-based pipeline can achieve an Area Under the Curve (AUC) of around 90% when trained with a proper labeling protocol. Finally, we shed light on the ability of the trained classifiers to localize relevant regions, which encourages future work to improve the localizations. Our proposed dataset is publicly available: https://ihc4bc.github.io/
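The labeling protocol described above can be illustrated with a minimal sketch. The abstract does not state the H-score formula or any cutoffs, so the code assumes the conventional H-score definition (percentage of weakly, moderately, and strongly stained cells weighted by 1, 2, and 3, giving a range of 0 to 300). The function names `h_score` and `her2_label` are hypothetical and are not taken from the dataset's code.

```python
from typing import Optional


def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    """Conventional H-score (assumed definition, not confirmed by the abstract).

    Each argument is the percentage (0-100) of cells at that staining
    intensity; the result lies in the range 0-300.
    """
    percentages = (pct_weak, pct_moderate, pct_strong)
    if any(p < 0 for p in percentages) or sum(percentages) > 100 + 1e-9:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


def her2_label(ihc_score: str) -> Optional[int]:
    """Map a HER2 IHC score to the binary label described in the abstract.

    0 and 1+ form the negative class (0), 3+ forms the positive class (1),
    and the equivocal score 2+ is excluded (returned as None).
    """
    if ihc_score == "2+":
        return None  # equivocal cases are dropped from the dataset
    mapping = {"0": 0, "1+": 0, "3+": 1}
    if ihc_score not in mapping:
        raise ValueError(f"unknown HER2 IHC score: {ihc_score!r}")
    return mapping[ihc_score]


if __name__ == "__main__":
    print(h_score(10, 20, 30))
    print([her2_label(s) for s in ["0", "1+", "2+", "3+"]])
```

Returning `None` for the 2+ score, rather than raising an error, lets a caller filter equivocal cases out of a list of labels in one pass.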