Robustness in machine learning is commonly studied in the adversarial setting, yet real-world noise (such as measurement noise) is random rather than adversarial. Model behavior under such noise is captured by average-case robustness, i.e., the probability of obtaining consistent predictions in a local region around an input. However, the naïve approach to computing average-case robustness based on Monte Carlo sampling is statistically inefficient, especially for high-dimensional data, leading to prohibitive computational costs for large-scale applications. In this work, we develop the first analytical estimators to efficiently compute average-case robustness of multi-class discriminative models. These estimators linearize models in the local region around an input and analytically compute the robustness of the resulting linear models. We show empirically that these estimators efficiently compute the robustness of standard deep learning models and demonstrate these estimators' usefulness for various tasks involving robustness, such as measuring robustness bias and identifying dataset samples that are vulnerable to noise perturbation. In doing so, this work not only proposes a new framework for robustness, but also makes its computation practical, enabling the use of average-case robustness in downstream applications.
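To make the contrast between sampling and an analytical estimate concrete, here is a minimal sketch for the simplest possible case: a two-class linear model under isotropic Gaussian noise. It is an illustration under stated assumptions, not the estimators developed in this work. The function names (`analytic_robustness`, `monte_carlo_robustness`), the Gaussian noise model, and the restriction to two classes are all assumptions made for the example. In this case the prediction stays consistent exactly when `margin + w_diff . eps > 0`, and since `w_diff . eps` is Gaussian with standard deviation `sigma * ||w_diff||`, the probability has a closed form in terms of the standard normal CDF.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def analytic_robustness(w_diff, margin, sigma):
    """Closed-form P[margin + w_diff . eps > 0] for eps ~ N(0, sigma^2 I).

    Assumptions for this sketch (not taken from the abstract):
      w_diff : weight vector of the predicted class minus that of the other class
      margin : logit gap at the clean input (positive when the clean prediction holds)
      sigma  : standard deviation of the isotropic Gaussian noise
    """
    norm = math.sqrt(sum(v * v for v in w_diff))
    return normal_cdf(margin / (sigma * norm))


def monte_carlo_robustness(w_diff, margin, sigma, n_samples, seed=0):
    """Sampling baseline: fraction of noise draws that keep the prediction."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw eps coordinate by coordinate and project it onto w_diff.
        shift = sum(v * rng.gauss(0.0, sigma) for v in w_diff)
        if margin + shift > 0:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    w_diff, margin, sigma = [3.0, 4.0], 5.0, 1.0  # ||w_diff|| = 5, so z = 1
    print(analytic_robustness(w_diff, margin, sigma))            # Phi(1), about 0.8413
    print(monte_carlo_robustness(w_diff, margin, sigma, 20000))  # noisy estimate of the same value
```

The closed form costs one norm and one CDF evaluation, while the sampling estimate has a standard error that shrinks only as the inverse square root of the number of samples. With more than two classes, the corresponding probability for a linear model involves a multivariate normal CDF over all class-pair margins rather than a single univariate one, which this sketch does not attempt.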