One prominent way to evaluate the trustworthiness of a machine learning model is calibration. In the binary outcome setting, a probabilistic predictor is calibrated if, conditioned on its prediction, outcomes are realized according to that prediction. Straightforward extensions of binary calibration to probabilistic multiclass classifiers suffer from a complexity blowup, because the space of predictions grows exponentially in the number of classes $n$. As a remedy, Noarov and Roth (2023) propose multiclass calibration with predictions that are properties of the outcome distribution, so that complexity scales with the dimension $d$ of the property, called its elicitation complexity, rather than with the number of classes $n$. Previous work on approximate property calibration is generally limited to continuous scalar properties, even though many properties of interest, such as the mode or rankings, are discrete. We characterize approximate property calibration for strongly orderable discrete properties by using Lipschitz continuous properties as an intermediary. To our knowledge, this work is the first to provide approximate calibration results for discrete properties. Along the way, we characterize the Lipschitz elicitation complexity of strongly orderable discrete properties by constructing algorithms that design these Lipschitz properties, which we prove can be post-processed to recover the original discrete property.
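To make the binary notion of calibration concrete, the following minimal sketch estimates how far a predictor is from calibrated by grouping predictions into bins and comparing each bin's mean prediction with the empirical frequency of the positive outcome. This is a standard binned estimate used here for illustration only, not the approximate property calibration studied in this work; the function name `binned_calibration_error`, the choice of ten equal-width bins, and the synthetic data are all assumptions.

```python
import random


def binned_calibration_error(preds, outcomes, n_bins=10):
    """Weighted average, over bins, of |mean prediction - empirical frequency|.

    Illustrative assumption: equal-width bins on [0, 1]; outcomes are 0 or 1.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(preds, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    total = len(preds)
    error = 0.0
    for bucket in bins:
        if not bucket:
            continue  # empty bins contribute nothing
        mean_pred = sum(p for p, _ in bucket) / len(bucket)
        freq = sum(y for _, y in bucket) / len(bucket)
        error += (len(bucket) / total) * abs(mean_pred - freq)
    return error


# Synthetic data: each outcome is drawn with probability equal to its prediction,
# so these predictions are calibrated by construction.
rng = random.Random(0)
preds = [rng.random() for _ in range(20000)]
outcomes = [1 if rng.random() < p else 0 for p in preds]

calibrated_error = binned_calibration_error(preds, outcomes)

# A constant prediction of 0.9 on the same outcomes ignores the base rate
# (roughly one half here) and is therefore far from calibrated.
overconfident_error = binned_calibration_error([0.9] * len(outcomes), outcomes)

print(calibrated_error, overconfident_error)
```

On this synthetic data the first error should be close to zero and the second large. Extending the same binning to full multiclass predictions would require bins over the probability simplex, whose number grows exponentially in $n$, which is the blowup that property-based calibration is meant to avoid.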