Active learning aims to maximize model performance while minimizing annotation cost by selecting the most informative samples from an unlabelled pool. Traditional uncertainty sampling often introduces sampling bias, because it tends to pick many similar uncertain samples. We propose an active learning method that uses fixed equiangular points on a hypersphere as class prototypes, ensuring consistent inter-class separation and robust feature representations. Our approach introduces Maximally Separated Active Learning (MSAL) for uncertainty sampling and a combined strategy (MSAL-D) that also incorporates diversity. The method eliminates the need for costly clustering steps while maintaining diversity through hyperspherical uniformity. We demonstrate strong performance relative to existing active learning techniques across five benchmark datasets, highlighting the method's effectiveness and ease of integration. The code is available on GitHub.
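To make the idea of fixed equiangular prototypes concrete, the following is a minimal NumPy sketch, not the released implementation. It builds one unit vector per class so that every pair of prototypes has the same cosine similarity, -1/(K-1) for K classes, and then ranks unlabelled samples by the gap between their two highest prototype similarities. The function names (simplex_prototypes, margin_uncertainty, select_most_uncertain) and the margin-based score are assumptions made for illustration; the abstract does not state the exact scoring rule used by MSAL or MSAL-D.

```python
import numpy as np


def simplex_prototypes(num_classes):
    """Return num_classes unit vectors with pairwise cosine -1/(num_classes - 1).

    Construction (an assumption, one of several equivalent ones): centre the
    one-hot basis vectors, then rescale each row to unit length.
    """
    centered = np.eye(num_classes) - 1.0 / num_classes
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)


def margin_uncertainty(features, prototypes):
    """Gap between the two largest cosine similarities; small gap = uncertain.

    Hypothetical score: the source does not specify this rule.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = feats @ prototypes.T              # shape (n_samples, n_classes)
    top2 = np.sort(sims, axis=1)[:, -2:]     # two largest values, ascending
    return top2[:, 1] - top2[:, 0]


def select_most_uncertain(features, prototypes, budget):
    """Indices of the `budget` samples with the smallest similarity margin."""
    margins = margin_uncertainty(features, prototypes)
    return np.argsort(margins)[:budget]


if __name__ == "__main__":
    protos = simplex_prototypes(5)
    rng = np.random.default_rng(0)
    pool = rng.normal(size=(100, 5))         # stand-in for encoder features
    print(select_most_uncertain(pool, protos, budget=10))
```

Because the prototypes are fixed rather than learned, they can be computed once before training; in this sketch the pool features would come from whatever encoder is trained to align samples with their class prototype.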