We introduce the Similarity-Distance-Magnitude (SDM) activation function, a more robust and interpretable formulation of the standard softmax activation function. SDM adds Similarity (i.e., correctly predicted depth-matches into training) awareness and Distance-to-training-distribution awareness to the existing output Magnitude (i.e., decision-boundary) awareness, and enables interpretability-by-exemplar via dense matching. We further introduce the SDM estimator, which partitions the class-wise empirical CDFs via the SDM activation in a data-driven manner to control the class- and prediction-conditional accuracy among selective classifications. When used as the final-layer activation over pre-trained language models for selective classification, the SDM estimator is more robust to covariate shifts and out-of-distribution inputs than existing calibration methods using softmax activations, while remaining informative over in-distribution data.
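To make the three signals concrete, the sketch below shows one plausible way a Similarity term (depth of correctly predicted nearest-neighbor matches into training) and a Distance-to-training term could rescale the usual softmax Magnitude signal. This is an illustrative assumption only, not the paper's actual formulation; the function name `sdm_activation_sketch`, the choice of k, the distance kernel, and the multiplicative rescaling are all hypothetical.

```python
import numpy as np

def sdm_activation_sketch(logits, query_emb, train_embs, train_labels,
                          train_was_correct, k=8):
    """Illustrative sketch only -- not the authors' exact formulation.

    Combines three signals into a softmax-like output:
      Magnitude:  the usual logit (decision-boundary) signal,
      Similarity: fraction of the query's k nearest training neighbors that
                  share the predicted label and were themselves correctly
                  predicted (a proxy for "depth-matches into training"),
      Distance:   closeness of the query to the training distribution,
                  here a simple kernel over the mean neighbor distance.
    """
    pred = int(np.argmax(logits))

    # Distance from the query to each training exemplar (final-layer embeddings).
    dists = np.linalg.norm(train_embs - query_emb, axis=1)
    nn_idx = np.argsort(dists)[:k]

    # Similarity: correctly predicted neighbors matching the predicted class.
    sim = np.mean((train_labels[nn_idx] == pred) & train_was_correct[nn_idx])

    # Distance-to-training-distribution score, squashed into (0, 1].
    dist_score = np.exp(-dists[nn_idx].mean())

    # Rescale the Magnitude signal before the softmax: low-similarity or
    # out-of-distribution inputs are pushed toward a flat, low-confidence output.
    scaled = logits * sim * dist_score
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()
```

Under this reading, the nearest training neighbors also serve as exemplars for interpretability: the same matches that drive the Similarity term can be surfaced to explain a prediction, which is the spirit of the interpretability-by-exemplar claim in the abstract.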