Learning algorithms can be significantly improved by routing complex or uncertain inputs to specialized experts, balancing accuracy with computational cost. This approach, known as learning to defer, is essential in domains such as natural language generation, medical diagnosis, and computer vision, where effective deferral can reduce errors with little additional resource consumption. However, the two-stage learning to defer setting, which leverages existing predictors such as a collection of large language models (LLMs) or other classifiers, often suffers from an expert imbalance problem. This imbalance can lead to suboptimal performance, with deferral algorithms favoring the majority expert. We present a comprehensive study of two-stage learning to defer in expert imbalance settings. We cast the deferral loss optimization as a novel cost-sensitive learning problem over the input-expert domain. We derive new margin-based loss functions and guarantees tailored to this setting, and develop novel algorithms for cost-sensitive learning. Leveraging these results, we design principled deferral algorithms, called MILD (Margin-based Imbalanced Learning to Defer), specifically suited for expert imbalance settings. Extensive experiments demonstrate the effectiveness of our approach, showing clear improvements over existing baselines on both image classification and real-world LLM routing tasks.
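To make the expert imbalance problem described in the abstract concrete, here is a minimal illustrative sketch. It is not the MILD algorithm, and none of it comes from the paper. It assumes two pre-trained experts with hypothetical 0/1 costs (0 if the expert is the right choice for an input, 1 otherwise) and a synthetic dataset in which expert 0 is the best choice on roughly 90% of inputs. The names `make_costs` and `deferral_loss` are invented for this example. The deferral loss of a router on one input is taken to be the cost of the expert it selects.

```python
import random


def make_costs(n=1000, majority_rate=0.9, seed=0):
    """Synthetic per-input cost vectors for two experts (an assumption).

    Expert 0 is the best choice (cost 0.0) on about `majority_rate` of the
    inputs; expert 1 is the best choice on the remaining inputs.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        best = 0 if rng.random() < majority_rate else 1
        rows.append([0.0 if j == best else 1.0 for j in range(2)])
    return rows


def deferral_loss(chosen, costs):
    """Cost incurred on one input when the router selects expert `chosen`."""
    return costs[chosen]


def mean(values):
    return sum(values) / len(values)


rows = make_costs()

# A degenerate router that always defers to the majority expert (expert 0).
overall = mean([deferral_loss(0, c) for c in rows])

# The same router, evaluated separately on the inputs each expert handles best.
on_majority_inputs = mean([deferral_loss(0, c) for c in rows if c[0] == 0.0])
on_minority_inputs = mean([deferral_loss(0, c) for c in rows if c[1] == 0.0])

print(f"overall loss:            {overall:.3f}")
print(f"loss on majority inputs: {on_majority_inputs:.3f}")
print(f"loss on minority inputs: {on_minority_inputs:.3f}")
```

Under these assumptions, the always-majority router has a small average loss even though it is wrong on every input where the minority expert would have been the better choice. This is the kind of failure mode that an imbalance-aware deferral method is meant to address. Real deferral costs would typically also account for each expert's computational cost, which this sketch omits.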