We introduce the Similarity-Distance-Magnitude (SDM) activation function, a more robust and interpretable formulation of the standard softmax activation function. It adds Similarity (i.e., correctly predicted depth-matches into training) awareness and Distance-to-training-distribution awareness to the existing output Magnitude (i.e., decision-boundary) awareness, and it enables interpretability-by-exemplar via dense matching. We further introduce the SDM estimator, based on a data-driven partitioning of the class-wise empirical CDFs via the SDM activation, to control the class- and prediction-conditional accuracy among selective classifications. When used as the final-layer activation over pre-trained language models for selective classification, the SDM estimator is more robust to covariate shifts and out-of-distribution inputs than existing calibration methods using softmax activations, while remaining informative over in-distribution data.