The deployment of safe and trustworthy machine learning systems, and in particular of complex black-box neural networks, in real-world applications requires reliable and certified guarantees on their performance. The conformal prediction framework offers such formal guarantees by transforming any point predictor into a set predictor with valid, finite-sample guarantees on the coverage of the true label at a chosen level of confidence. Central to this methodology is the notion of the nonconformity score function, which assigns to each example a measure of "strangeness" in comparison with previously seen observations. While the coverage guarantee holds regardless of the nonconformity measure, the point predictor, and the dataset, previous research has shown that the performance of a conformal model, as measured by its efficiency (the average size of the predicted sets) and its informativeness (the proportion of prediction sets that are singletons), is influenced by the choice of the nonconformity score function. The current work introduces the Penalized Inverse Probability (PIP) nonconformity score, and its regularized version RePIP, which allow the joint optimization of both efficiency and informativeness. Through toy examples and empirical results on the task of crop and weed image classification in agricultural robotics, the current work shows that PIP-based conformal classifiers exhibit precisely the desired behavior in comparison with other nonconformity measures and strike a good balance between informativeness and efficiency.
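The split-conformal procedure the abstract builds on can be sketched in a few lines. The PIP score itself is not defined in this abstract, so the sketch below uses the classical inverse probability score s(x, y) = 1 - p̂(y | x) as a stand-in, with synthetic softmax outputs in place of a real classifier; the efficiency and informativeness metrics at the end are the two quantities the abstract discusses.

```python
import numpy as np

# Minimal split-conformal classification sketch. The inverse probability
# score s(x, y) = 1 - p_hat(y | x) stands in for PIP, whose definition is
# not given here; all probabilities are synthetic, not real model outputs.

rng = np.random.default_rng(0)
n_cal, n_classes = 500, 4

# Synthetic calibration data: softmax probability vectors and true labels
# drawn from those same vectors.
probs_cal = rng.dirichlet(np.ones(n_classes) * 2.0, size=n_cal)
labels_cal = np.array([rng.choice(n_classes, p=p) for p in probs_cal])

alpha = 0.1  # target miscoverage: sets cover the true label >= 90% of the time

# Nonconformity score of each calibration example under its true label.
scores = 1.0 - probs_cal[np.arange(n_cal), labels_cal]

# Conformal quantile with the finite-sample correction (n + 1 in the numerator).
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
q_hat = np.quantile(scores, q_level, method="higher")

def predict_set(probs):
    """Return every label whose nonconformity score is below the threshold."""
    return np.where(1.0 - probs <= q_hat)[0]

# Efficiency = mean set size; informativeness = fraction of singleton sets.
probs_test = rng.dirichlet(np.ones(n_classes) * 2.0, size=200)
sets = [predict_set(p) for p in probs_test]
print("mean set size:", np.mean([len(s) for s in sets]))
print("singleton fraction:", np.mean([len(s) == 1.0 for s in sets]))
```

Swapping the score line for a PIP or RePIP implementation leaves the rest of the pipeline, and its coverage guarantee, unchanged; only the resulting set sizes and singleton rates differ.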