Cross-dataset domain adaptation for the classification of COVID-19 using chest computed tomography images

Detecting COVID-19 patients using computed tomography (CT) images of the lungs is an active area of research, and datasets of CT images from COVID-19 patients are becoming available. Deep learning (DL) solutions, in particular convolutional neural networks (CNNs), have achieved impressive results for the classification of COVID-19 CT images, but only when training and testing take place within the same dataset. Work on the cross-dataset problem is still limited, and the reported performance remains low. Our work tackles the cross-dataset problem through a domain adaptation (DA) technique with deep learning. Our proposed solution, COVID19-DANet, is based on a pre-trained CNN backbone for feature extraction. For this task, we select the pre-trained EfficientNet-B3 CNN because it has achieved impressive classification accuracy in previous work. The backbone CNN is followed by a prototypical layer, a concept borrowed from prototypical networks in few-shot learning (FSL). This layer computes the cosine distance between each sample and the class prototypes and then converts these distances to class probabilities using the softmax function. To train the COVID19-DANet model, we propose a combined loss function composed of the standard cross-entropy loss for class discrimination and a second entropy loss computed over the unlabelled target set only. This unlabelled target entropy loss is minimized and maximized in an alternating fashion to reach the two objectives of class discrimination and domain invariance. COVID19-DANet is tested under four cross-dataset scenarios using the SARS-CoV-2-CT and COVID19-CT datasets and achieves encouraging results compared to recent work in the literature.
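
The prototypical layer and the combined loss described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the `temperature` scaling, the weight `lam`, and the toy two-dimensional features are assumptions made for illustration, and the sketch scores samples by cosine similarity (equivalently, negative cosine distance) before the softmax. The abstract does not state which parameters are updated in the minimization and maximization phases, so the sketch only evaluates the objective with the sign of the target entropy term flipped.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prototypical_layer(feature, prototypes, temperature=0.1):
    """Class probabilities from similarity to each class prototype.

    `temperature` is an assumed hyperparameter, not taken from the paper.
    """
    scores = [cosine_similarity(feature, p) / temperature for p in prototypes]
    return softmax(scores)


def cross_entropy(probs, label):
    """Standard cross-entropy for one labelled sample."""
    return -math.log(probs[label])


def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def combined_loss(source, target, prototypes, lam=0.1, maximize_entropy=False):
    """Cross-entropy on labelled source data plus a signed target entropy term.

    With maximize_entropy=True the entropy term enters with a negative sign,
    so minimizing the returned value maximizes the target entropy.
    """
    ce = sum(cross_entropy(prototypical_layer(x, prototypes), y)
             for x, y in source) / len(source)
    h = sum(entropy(prototypical_layer(x, prototypes))
            for x in target) / len(target)
    sign = -1.0 if maximize_entropy else 1.0
    return ce + sign * lam * h


# Toy data: two classes, two-dimensional "features" (hypothetical values).
prototypes = [[1.0, 0.0], [0.0, 1.0]]
source = [([0.9, 0.1], 0), ([0.2, 0.8], 1)]   # labelled source samples
target = [[0.6, 0.4], [0.5, 0.5]]             # unlabelled target samples

loss_min_step = combined_loss(source, target, prototypes)
loss_max_step = combined_loss(source, target, prototypes, maximize_entropy=True)
```

In a real model the features would come from the CNN backbone and both the backbone and the prototypes would be trained by gradient descent; the sketch only shows how the two loss terms are formed from the prototypical layer's output.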
