Machine learning models often need to be robust to noisy input data. The effect of real-world noise (which is often random) on model predictions is captured by a model's local robustness, i.e., the consistency of model predictions in a local region around an input. However, the naïve approach to computing local robustness based on Monte-Carlo sampling is statistically inefficient, leading to prohibitive computational costs for large-scale applications. In this work, we develop the first analytical estimators to efficiently compute local robustness of multi-class discriminative models using local linear function approximation and the multivariate Normal CDF. Through the derivation of these estimators, we show how local robustness is connected to concepts such as randomized smoothing and softmax probability. We also confirm empirically that these estimators accurately and efficiently compute the local robustness of standard deep learning models. In addition, we demonstrate these estimators' usefulness for various tasks involving local robustness, such as measuring robustness bias and identifying examples that are vulnerable to noise perturbation in a dataset. By developing these analytical estimators, this work not only advances conceptual understanding of local robustness, but also makes its computation practical, enabling the use of local robustness in critical downstream applications.
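To make the contrast between the sampling-based and analytical approaches concrete, the following is a minimal sketch and not the estimators developed in this work. It assumes the simplest possible setting: a two-class, exactly linear classifier under isotropic Gaussian input noise, where the multivariate Normal CDF reduces to the univariate Normal CDF. The function names (`predict`, `robustness_monte_carlo`, `robustness_analytical`) and all parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def predict(w, b, x):
    """Label (0 or 1) of the linear classifier that thresholds w.x + b at 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0


def robustness_monte_carlo(w, b, x, sigma, n_samples=20000, seed=0):
    """Naive estimate: fraction of Gaussian-perturbed inputs that keep the label."""
    rng = random.Random(seed)
    label = predict(w, b, x)
    same = 0
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        same += predict(w, b, noisy) == label
    return same / n_samples


def robustness_analytical(w, b, x, sigma):
    """Closed form for this linear case: Phi(|w.x + b| / (sigma * ||w||)).

    The perturbed score is Gaussian with mean w.x + b and standard
    deviation sigma * ||w||, so the probability of keeping the label
    is a univariate Normal CDF evaluated at the normalized margin.
    """
    margin = abs(sum(wi * xi for wi, xi in zip(w, x)) + b)
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    z = margin / (sigma * norm_w)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Illustrative (assumed) weights, input, and noise level.
w, b = [1.0, -2.0, 0.5], 0.1
x, sigma = [0.3, -0.2, 0.4], 0.5
mc_estimate = robustness_monte_carlo(w, b, x, sigma)
closed_form = robustness_analytical(w, b, x, sigma)
```

In this toy setting the closed form needs a single evaluation, whereas the sampling estimate needs thousands of forward passes to reach comparable precision. For more than two classes, the same reasoning would involve the joint distribution of the margins between the predicted class and every other class, which is where a multivariate Normal CDF would replace the univariate one; for a nonlinear model, the linear form would only hold locally as an approximation.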