In recent years, performance on image classification tasks has increased considerably, mainly due to the adoption of deep learning techniques. These techniques generally demand a large set of annotated data, which makes them challenging to apply to small datasets. In this scenario, transfer learning strategies have become a promising alternative. This work compares the performance of different pre-trained neural networks used for feature extraction in image classification tasks. We evaluated 16 different pre-trained models on four image datasets. Our results show that the best overall performance across the datasets was achieved by CLIP-ViT-B and ViT-H-14, while the CLIP-ResNet50 model had similar performance but with less variability. Our study therefore provides evidence to support the choice of models for feature extraction in image classification tasks.