Active learning aims to select the most informative samples for labeling, minimizing annotation costs. This paper introduces a unified representation learning framework tailored for task-aware active learning. It integrates diverse loss sources, including reconstruction, adversarial, self-supervised, knowledge-distillation, and classification losses, into a unified VAE-based approach called ADROIT. The proposed approach comprises three key components: a unified representation generator (VAE), a state discriminator, and a (proxy) task learner or classifier. ADROIT learns a latent code from both labeled and unlabeled data, and incorporates task awareness by leveraging labeled data through the proxy classifier. Unlike previous approaches, the proxy classifier additionally employs a self-supervised loss on unlabeled data and uses knowledge distillation to align with the target task learner. The state discriminator distinguishes between labeled and unlabeled data, facilitating the selection of informative unlabeled samples. The dynamic interaction between the VAE and the state discriminator creates a competitive environment: the VAE attempts to deceive the discriminator, while the discriminator learns to differentiate labeled from unlabeled inputs. Extensive evaluations on diverse datasets, together with ablation analysis, confirm the effectiveness of the proposed model.
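To make the discriminator-based selection step concrete, the following is a minimal sketch (not the authors' implementation) of how a trained state discriminator's outputs could drive sample selection: unlabeled samples that the discriminator confidently recognizes as *unlabeled* are the ones the VAE failed to disguise, and are therefore treated as the most informative candidates for annotation. The function name and budget parameter are illustrative assumptions.

```python
import numpy as np

def select_for_labeling(disc_prob_labeled: np.ndarray, budget: int) -> np.ndarray:
    """Pick indices of unlabeled samples to send for annotation.

    disc_prob_labeled: discriminator's estimated probability that each
        unlabeled sample comes from the labeled pool (illustrative output).
    budget: number of samples to select per active-learning round.

    A low probability means the VAE could not fool the discriminator for
    that sample, i.e., it lies far from the labeled distribution in latent
    space and is assumed to be informative.
    """
    order = np.argsort(disc_prob_labeled)  # ascending: most "unlabeled-like" first
    return order[:budget]

# Example: four unlabeled samples, pick two for annotation.
probs = np.array([0.9, 0.1, 0.5, 0.3])
picked = select_for_labeling(probs, budget=2)  # indices 1 and 3
```

In each active-learning round, the selected samples would be annotated, moved to the labeled pool, and the VAE, discriminator, and proxy classifier retrained before the next selection.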