Deep Learning Segmentation and Classification of Red Blood Cells Using a Large Multi-Scanner Dataset

Digital pathology has recently been revolutionized by advancements in artificial intelligence, deep learning, and high-performance computing. With its advanced tools, digital pathology can help improve and speed up the diagnostic process, reduce human errors, and streamline the reporting step. In this paper, we report a new large red blood cell (RBC) image dataset and propose a two-stage deep learning framework for RBC image segmentation and classification. The dataset is highly diverse, comprising more than 100K RBCs across eight classes, and is considerably larger than any publicly available hematopathology dataset. It was labeled independently by two hematopathologists, who also manually created the masks for RBC segmentation. In the first stage of the proposed framework, a U-Net model was trained to segment RBC images automatically. In the second stage, an EfficientNetB0 model was trained to classify RBC images into one of the eight classes using a transfer learning approach with a 5×2 cross-validation scheme. A segmentation IoU of 98.03% and an average classification accuracy of 96.5% were attained on the test set. Moreover, we have performed experimental comparisons against several prominent CNN models. These comparisons show the superiority of the proposed model, which offers a good balance between performance and computational cost.
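The two evaluation ingredients named in the abstract, IoU for segmentation and 5×2 cross-validation for classification, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `iou` and `five_by_two_splits`, the nested-list mask representation, and the stratification by class are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks (nested lists of 0/1).

    Hypothetical helper; the paper does not specify its implementation.
    """
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; define IoU as 1.0 in that case.
    return 1.0 if union == 0 else intersection / union


def five_by_two_splits(labels, seed=0):
    """Yield (train_idx, test_idx) pairs for 5 repetitions of 2-fold CV.

    Assumption: folds are stratified so each half keeps the class balance.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(5):
        fold_a, fold_b = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            fold_a.extend(shuffled[:half])
            fold_b.extend(shuffled[half:])
        # Each half serves once as the training set and once as the test set.
        yield sorted(fold_a), sorted(fold_b)
        yield sorted(fold_b), sorted(fold_a)
```

With this scheme a model is trained and evaluated ten times in total (five random halvings, two directions each), and the reported accuracy would be the average over those runs.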
