Fossil Image Identification using Deep Learning Ensembles of Data Augmented Multiviews

Identification of fossil species is crucial to evolutionary studies. Recent advances in deep learning have shown promising prospects for fossil image identification. However, the quantity and quality of labeled fossil images are often limited by fossil preservation, conditioned sampling, and expensive and inconsistent label annotation by domain experts, which poses great challenges to training deep learning based image classification models. To address these challenges, we follow the idea of the wisdom of crowds and propose a novel multiview ensemble framework, which collects multiple views of each fossil specimen image, each reflecting different characteristics, to train multiple base deep learning models and then makes final decisions via soft voting. We further develop the OGS method, which integrates original, gray, and skeleton views under this framework, to demonstrate its effectiveness. Experimental results on a fusulinid fossil dataset across five milestone deep learning models show that OGS, using three base models, consistently outperforms the baseline using a single base model, and an ablation study verifies the usefulness of each selected view. In addition, OGS achieves superior or comparable performance relative to a method built on the well-known bagging framework. Moreover, as the available training data decreases, the proposed framework achieves larger performance gains over the baseline. Furthermore, a consistency test with two human experts shows that OGS obtains the highest agreement with both the dataset labels and the two experts. Notably, the methodology is designed for general fossil identification and is expected to find applications on other fossil datasets. The results suggest potential applications when the quantity and quality of labeled data are particularly restricted, e.g., in identifying rare fossil images.
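The soft-voting step of the multiview ensemble can be illustrated with a minimal sketch. It assumes each base model (one per view) has already produced a class-probability vector for the same specimen, and that the views are weighted equally; the abstract does not state the weighting, so the unweighted mean here is an assumption. The function names `soft_vote` and `predict` and the probability values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from typing import Dict, List, Sequence


def soft_vote(view_probs: Dict[str, Sequence[float]]) -> List[float]:
    """Average the class-probability vectors predicted from each view.

    Assumption: every view gets equal weight (plain unweighted soft voting).
    """
    vectors = list(view_probs.values())
    if not vectors:
        raise ValueError("at least one view is required")
    n_classes = len(vectors[0])
    if any(len(v) != n_classes for v in vectors):
        raise ValueError("all views must predict the same number of classes")
    n_views = len(vectors)
    return [sum(v[c] for v in vectors) / n_views for c in range(n_classes)]


def predict(view_probs: Dict[str, Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    averaged = soft_vote(view_probs)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical outputs of three base models for one specimen (3 classes).
example_probs = {
    "original": [0.5, 0.4, 0.1],
    "gray": [0.2, 0.7, 0.1],
    "skeleton": [0.3, 0.6, 0.1],
}

predicted_class = predict(example_probs)
```

In this made-up example the model trained on the original view alone would choose class 0, while the averaged probabilities favor class 1, which shows how views that disagree are reconciled by soft voting rather than by any single model.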
