Algorithmic stability is a central notion in learning theory that quantifies the sensitivity of an algorithm to small changes in the training data. If a learning algorithm satisfies certain stability properties, then many important downstream guarantees follow, such as generalization, robustness, and reliable predictive inference. Verifying that stability holds for a particular algorithm is therefore an important and practical question. However, recent results establish that testing the stability of a black-box algorithm is impossible, given limited data from an unknown distribution, in settings where the data lies in an uncountably infinite space (such as real-valued data). In this work, we extend this question to a far broader range of settings, where the data may lie in any space -- for example, categorical data. We develop a unified framework for quantifying the hardness of testing algorithmic stability, which establishes that across all settings, if the available data is limited then exhaustive search is essentially the only universally valid mechanism for certifying algorithmic stability. Since any practical test of stability is naturally subject to computational constraints, exhaustive search is infeasible, and so this implies fundamental limits on our ability to test the stability property of a black-box algorithm.
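To make the notion of stability concrete, the following is a minimal illustrative sketch (not the paper's construction): it probes a black-box learner by refitting with each training point removed in turn and measuring how much the fitted model moves. The toy averaging algorithm `fit_mean` and the helper `loo_perturbations` are assumptions introduced purely for illustration.

```python
import statistics

def fit_mean(dataset):
    """A toy black-box 'learning algorithm': fits a model by averaging
    the data. Stands in for an arbitrary black box."""
    return statistics.mean(dataset)

def loo_perturbations(algorithm, dataset):
    """Empirically probe stability: compare the model fit on the full
    dataset against models fit with one point left out (leave-one-out).
    Large perturbations indicate instability on this particular dataset."""
    full = algorithm(dataset)
    return [abs(algorithm(dataset[:i] + dataset[i + 1:]) - full)
            for i in range(len(dataset))]

data = [0.0, 0.1, 0.2, 10.0]  # one outlier makes the mean unstable
deltas = loo_perturbations(fit_mean, data)
print(max(deltas))  # removing the outlier shifts the fit substantially
```

Note that such empirical probes can only witness *instability* on observed data; as the abstract explains, certifying that stability *holds* for a black box is what limited data cannot do without exhaustive search.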