Group sparsity in Machine Learning (ML) encourages simpler, more interpretable models with fewer active parameter groups. This work incorporates structured group sparsity into the shared parameters of a Multi-Task Learning (MTL) framework to develop parsimonious models that address multiple tasks with fewer parameters while matching or exceeding the performance of a dense model. Sparsifying the model during training reduces its memory footprint, computation requirements, and prediction time at inference. We apply channel-wise l1/l2 group sparsity to the shared layers of a Convolutional Neural Network (CNN). This approach not only eliminates extraneous groups (channels) but also penalizes the weights, which improves the learning of all tasks. We compare single-task and multi-task experiments under group sparsity on two publicly available MTL datasets, NYU-v2 and CelebAMask-HQ. We also investigate how the degree of sparsification affects both model performance and group sparsity.
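As an illustration of the channel-wise l1/l2 penalty described above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each group is one output channel of a convolutional weight tensor of shape (out_channels, in_channels, kernel_h, kernel_w). The function names, the tolerance `tol`, and the regularization strength `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def group_l1_l2_penalty(weight):
    """l1/l2 group penalty: the sum over output channels of each channel's L2 norm."""
    out_channels = weight.shape[0]
    # Flatten every output channel (one group) into a row vector.
    groups = weight.reshape(out_channels, -1)
    return float(np.linalg.norm(groups, axis=1).sum())


def group_sparsity(weight, tol=1e-8):
    """Fraction of output channels whose L2 norm is at most tol (treated as zeroed out)."""
    out_channels = weight.shape[0]
    groups = weight.reshape(out_channels, -1)
    norms = np.linalg.norm(groups, axis=1)
    return float(np.mean(norms <= tol))


# Toy shared-layer weight: 8 output channels, 3 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))
w[[1, 5]] = 0.0  # pretend two channels were driven to zero during training

lam = 1e-3  # hypothetical regularization strength (the "sparsification degree")
regularizer = lam * group_l1_l2_penalty(w)  # term that would be added to the task losses

print(group_sparsity(w))  # 0.25, since 2 of the 8 channels are zero
```

In this sketch, a larger `lam` would push more channel norms toward zero, trading task performance against the fraction of channels that can be removed.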