Prior-data fitted networks (PFNs) have achieved exceptional performance on tabular classification tasks. However, like other classifiers, they can degrade under class imbalance, yielding poor performance on rare classes. Several techniques exist to mitigate the effect of class imbalance on classification performance, but because PFNs make predictions through in-context learning (ICL) rather than per-dataset training, loss-based strategies cannot be applied, and other techniques are unproven. We adapt several classical techniques for addressing class imbalance and analyze their effect on PFN classification. We observe that thresholding performs exceptionally well because of the calibration characteristics of PFNs, and that downsampling performs comparably because of PFNs' strong limited-data performance, with the additional benefit of reduced computation cost for inference.
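To make the two techniques named above concrete, the following is a minimal sketch in Python, not the authors' implementation. The helper names `downsample_majority` and `pick_threshold` are hypothetical, the choice of balanced accuracy as the selection criterion is an assumption (the abstract does not state which metric is used), and `p_pos` stands in for the class-1 probabilities that a PFN, or any other probabilistic classifier, would return on held-out data.

```python
import numpy as np


def downsample_majority(X, y, rng):
    """Randomly drop rows so every class keeps as many rows as the rarest class.

    Hypothetical helper: for a PFN, the reduced (X, y) would serve as the
    in-context training set, which also shortens the context at inference.
    """
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    keep.sort()  # preserve the original row order
    return X[keep], y[keep]


def pick_threshold(p_pos, y_true, grid=None):
    """Pick the cutoff on P(y=1) that maximizes balanced accuracy.

    Assumption: binary labels in {0, 1} with both classes present, and
    balanced accuracy as the criterion.
    """
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    best_t, best_score = 0.5, -1.0
    for t in grid:
        pred = p_pos >= t
        tpr = np.mean(pred[y_true == 1])    # recall on the rare class
        tnr = np.mean(~pred[y_true == 0])   # recall on the majority class
        score = 0.5 * (tpr + tnr)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Toy data: 90 majority rows and 10 minority rows.
rng = np.random.default_rng(0)
X_demo = np.arange(200, dtype=float).reshape(100, 2)
y_demo = np.array([0] * 90 + [1] * 10)
X_small, y_small = downsample_majority(X_demo, y_demo, rng)

# Toy held-out probabilities in which the default 0.5 cutoff misses every positive.
p_demo = np.array([0.05, 0.10, 0.15, 0.205, 0.305, 0.40])
y_held = np.array([0, 0, 0, 0, 1, 1])
t_demo, score_demo = pick_threshold(p_demo, y_held)
```

In the toy probabilities, both positives score below 0.5, so the default cutoff would predict the majority class for every row, while the selected cutoff separates the two classes. Thresholding only moves the decision boundary and leaves the predicted probabilities unchanged, which is why it depends on those probabilities being well calibrated.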