In the fast-evolving field of artificial intelligence, where models continue to grow in complexity and size, the availability of labeled data for training deep learning models has become a significant challenge. Complex problems such as object detection demand considerable time and resources for data labeling before meaningful results can be achieved. For companies developing such applications, this entails substantial investment in highly skilled personnel or costly outsourcing. This work aims to demonstrate that enhancing feature extractors can substantially alleviate this challenge, enabling models to learn more effective representations from less labeled data. Using a self-supervised learning strategy, we present a model trained on unlabeled data that outperforms state-of-the-art feature extractors pre-trained on ImageNet and specifically designed for object detection tasks. Moreover, the results show that our approach encourages the model to focus on the most relevant aspects of an object, yielding better feature representations and thereby reinforcing its reliability and robustness.