This paper focuses on Extreme Multi-Label Classification (XMC), the task of predicting multiple labels for each instance from an extremely large label space. While existing research has primarily focused on fully supervised XMC, real-world scenarios often lack complete supervision signals, which highlights the importance of zero-shot settings. Given the large label space, applying in-context learning directly is not trivial. We address this issue by introducing In-Context Extreme Multilabel Learning (ICXML), a two-stage framework that first narrows the search space by generating a set of candidate labels through in-context learning and then reranks those candidates. Extensive experiments suggest that ICXML advances the state of the art on two diverse public benchmarks.
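The generate-then-rerank idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. The generator is a hard-coded stand-in for a language model prompted with in-context demonstrations, and plain token-overlap (Jaccard) similarity is assumed both for mapping generated text onto the label space and for reranking. All function names, the toy label space, and the parameters `per_guess` and `top_k` are hypothetical.

```python
from typing import Callable, List


def tokens(text: str) -> set:
    """Lowercased whitespace tokens of a string."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for a learned scorer."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def shortlist(guesses: List[str], label_space: List[str],
              per_guess: int = 2) -> List[str]:
    """Stage 1: map free-form generated guesses onto real labels,
    keeping only a small candidate set instead of the full label space."""
    picked: List[str] = []
    for guess in guesses:
        ranked = sorted(label_space, key=lambda lab: jaccard(guess, lab),
                        reverse=True)
        for label in ranked[:per_guess]:
            if jaccard(guess, label) > 0 and label not in picked:
                picked.append(label)
    return picked


def rerank(instance: str, candidates: List[str],
           score: Callable[[str, str], float], top_k: int = 2) -> List[str]:
    """Stage 2: order the shortlisted candidates by relevance to the instance."""
    return sorted(candidates, key=lambda lab: score(instance, lab),
                  reverse=True)[:top_k]


def toy_generator(instance: str) -> List[str]:
    """Hypothetical placeholder for an LLM call with in-context examples."""
    return ["wireless headphones", "phone charger"]


label_space = [
    "bluetooth wireless headphones",
    "usb phone charger",
    "garden hose",
    "wired headphones",
    "laptop charger",
]
instance = "wireless headphones with a usb charger for my phone"

candidates = shortlist(toy_generator(instance), label_space)
final = rerank(instance, candidates, jaccard, top_k=2)
print(candidates)
print(final)
```

The point of the sketch is the shape of the pipeline: the second stage only ever scores the shortlisted candidates, so its cost depends on the size of the shortlist rather than on the size of the label space.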