Supervised deep learning models require a significant amount of labeled data to achieve acceptable performance on a specific task. However, when tested on unseen data, the models may not perform well, so they need to be trained with additional, varied labeled data to improve generalization. In this work, our goal is to understand these models, their performance, and their generalization. We establish image-image, dataset-dataset, and image-dataset distances to gain insight into a model's behavior. Combined with model performance, our proposed distance metric can help in selecting an appropriate model or architecture from a pool of candidate architectures. We show that the generalization of these models can be improved by adding only a small number of unseen images (for example 1, 3, or 7) to the training set. Our proposed approach reduces training and annotation costs while providing an estimate of model performance on unseen data in dynamic environments.
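The abstract names three kinds of distance but does not define them. As a purely illustrative sketch, suppose each image is represented by a fixed-length feature vector, that two images are compared with the Euclidean distance, and that distances involving a dataset are averaged over its images. All three choices, and the function names below, are assumptions made for illustration, not the metric proposed in this work.

```python
from math import dist
from statistics import fmean

# Assumption: an "image" is a fixed-length feature vector (tuple of floats),
# and a "dataset" is a non-empty list of such vectors.
# Euclidean distance and mean aggregation are illustrative choices only;
# they are not the metric defined by the authors.


def image_image_distance(a, b):
    """Euclidean distance between two image feature vectors."""
    return dist(a, b)


def image_dataset_distance(image, dataset):
    """Mean distance from one image to every image in a dataset."""
    return fmean(dist(image, other) for other in dataset)


def dataset_dataset_distance(first, second):
    """Mean distance over all cross-dataset image pairs."""
    return fmean(dist(a, b) for a in first for b in second)


if __name__ == "__main__":
    # Hypothetical toy feature vectors, not real data.
    train = [(0.0, 0.0), (0.0, 2.0)]
    unseen = [(3.0, 4.0), (0.0, 0.0)]

    print(image_image_distance(train[0], unseen[0]))
    print(image_dataset_distance(unseen[0], train))
    print(dataset_dataset_distance(train, unseen))
```

Under these assumptions, an unseen image with a large image-dataset distance to the training set is one the model has seen little like, which is the kind of signal the abstract proposes to combine with measured model performance.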