Most information in our world is organized hierarchically; however, many Deep Learning approaches do not leverage this semantically rich structure. Research suggests that human learning benefits from exploiting the hierarchical structure of information, and intelligent models could similarly take advantage of it through multi-task learning. In this work, we analyze the advantages and limitations of multi-task learning in a hierarchical multi-label classification problem: car make and model classification. Considering both parallel and cascaded multi-task architectures, we evaluate their impact on different Deep Learning classifiers (CNNs, Transformers) while varying key factors such as dropout rate and loss weighting to gain deeper insight into the effectiveness of this approach. Experiments are conducted on two established benchmarks: StanfordCars and CompCars. The multi-task paradigm proves effective on both datasets, improving the performance of the investigated CNN in almost all scenarios. Furthermore, the approach yields significant improvements on the CompCars dataset for both types of models.
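As a minimal sketch of the parallel multi-task setup described above, the training objective can be written as a weighted sum of per-task cross-entropy losses, one head per level of the hierarchy (make and model). This is an illustration only, not the paper's implementation; the weight values and function names below are hypothetical.

```python
import math

def cross_entropy(probs, label):
    # Negative log-likelihood of the true class, given softmax outputs.
    return -math.log(probs[label])

def multitask_loss(make_probs, make_label, model_probs, model_label,
                   w_make=0.3, w_model=0.7):
    # Parallel multi-task objective: weighted sum of the per-task
    # cross-entropy losses. The weights w_make / w_model are the kind of
    # loss-weighting factor varied in the experiments; these particular
    # values are illustrative, not taken from the paper.
    return (w_make * cross_entropy(make_probs, make_label)
            + w_model * cross_entropy(model_probs, model_label))

# Toy example: 2 makes, 3 models (softmax outputs already computed).
loss = multitask_loss([0.8, 0.2], 0, [0.1, 0.7, 0.2], 1)
```

In a cascaded variant, the make head's output (or its features) would additionally feed into the model head, so the coarse prediction conditions the fine-grained one; the weighted-sum objective stays the same.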