Machine learning algorithms are often used in environments that are not captured accurately even by the most carefully obtained training data, either due to the possibility of `adversarial' test-time attacks or on account of `natural' distribution shift. For test-time attacks, we introduce and analyze a novel robust reliability guarantee, which requires a learner to output each prediction along with a reliability radius $\eta$, meaning that the prediction is guaranteed to be correct as long as the adversary has not perturbed the test point farther than a distance $\eta$. We provide learners that are optimal in the sense that they always output the best possible reliability radius on any test point, and we characterize the reliable region, i.e., the set of points where a given reliability radius is attainable. We additionally analyze reliable learners under distribution shift, where the test points may come from an arbitrary distribution $Q$ different from the training distribution $P$. For both cases, we bound the probability mass of the reliable region for several interesting examples: linear separators under nearly log-concave and $s$-concave distributions, and smooth boundary classifiers under smooth probability distributions.