We introduce a new AI-ready computational pathology dataset containing restained and co-registered digitized images from eight head-and-neck squamous cell carcinoma patients. Specifically, the same tumor sections were first stained with the expensive multiplex immunofluorescence (mIF) assay and then restained with the cheaper multiplex immunohistochemistry (mIHC) assay. This is the first public dataset that demonstrates the equivalence of these two staining methods, which in turn enables several use cases; because of this equivalence, our cheaper mIHC staining protocol can offset the need for expensive mIF staining/scanning, which requires highly skilled lab technicians. Unlike the subjective and error-prone immune cell annotations from individual pathologists (disagreement > 50%) used to drive SOTA deep learning approaches, this dataset provides objective immune and tumor cell annotations via mIF/mIHC restaining, for more reproducible and accurate characterization of the tumor immune microenvironment (e.g. for immunotherapy). We demonstrate the effectiveness of this dataset in three use cases: (1) IHC quantification of CD3/CD8 tumor-infiltrating lymphocytes via style transfer, (2) virtual translation of cheap mIHC stains to more expensive mIF stains, and (3) virtual tumor/immune cellular phenotyping on standard hematoxylin images. The dataset is available at \url{https://github.com/nadeemlab/DeepLIIF}.