When are Local Queries Useful for Robust Learning?

Gourdeau et al. (2019) showed that distributional assumptions are necessary for the robust learnability of concept classes under the exact-in-the-ball robust risk when the learner only has access to random examples. In this paper, we study learning models where the learner is given more power through the use of local queries, and give the first distribution-free algorithms that perform robust empirical risk minimization (ERM) for this notion of robustness. The first learning model we consider uses local membership queries (LMQ), where the learner can query the label of points near the training sample. We show that, under the uniform distribution, LMQs do not increase the robustness threshold of conjunctions or of any superclass, e.g., decision lists and halfspaces. Faced with this negative result, we introduce the local equivalence query ($\mathsf{LEQ}$) oracle, which returns whether the hypothesis and target concept agree in the perturbation region around a point in the training sample, as well as a counterexample if one exists. We show a separation result: on the one hand, if the query radius $\lambda$ is strictly smaller than the adversary's perturbation budget $\rho$, then distribution-free robust learning is impossible for a wide variety of concept classes; on the other hand, the setting $\lambda=\rho$ allows us to develop robust ERM algorithms. We then bound the query complexity of these algorithms based on online learning guarantees and further improve these bounds for the special case of conjunctions. We finish by giving robust learning algorithms for halfspaces on $\{0,1\}^n$ and then obtain robustness guarantees for halfspaces in $\mathbb{R}^n$ against precision-bounded adversaries.
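
To make the $\mathsf{LEQ}$ oracle and the $\lambda=\rho$ setting concrete, the following is a minimal sketch on $\{0,1\}^n$ with Hamming balls as perturbation regions. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the oracle is simulated by brute-force enumeration of the ball, the learner is the standard literal-elimination procedure for conjunctions, and all names (`hamming_ball`, `leq_oracle`, `conjunction`, `robust_erm_conjunction`) are hypothetical.

```python
from itertools import combinations


def hamming_ball(x, radius):
    """Yield every point of {0,1}^n within Hamming distance `radius` of x."""
    n = len(x)
    for k in range(radius + 1):
        for idxs in combinations(range(n), k):
            z = list(x)
            for i in idxs:
                z[i] ^= 1  # flip the chosen coordinate
            yield tuple(z)


def leq_oracle(h, c, x, lam):
    """Simulated local equivalence query (assumption: brute-force search).

    Returns (True, None) if h and c agree on the whole ball of radius `lam`
    around x, and (False, z) with a counterexample z otherwise.
    """
    for z in hamming_ball(x, lam):
        if h(z) != c(z):
            return False, z
    return True, None


def conjunction(literals):
    """Build a conjunction from a set of (index, required_bit) pairs."""
    return lambda z: all(z[i] == b for i, b in literals)


def robust_erm_conjunction(sample, target, lam):
    """Query the LEQ oracle until the hypothesis agrees with `target` on the
    ball around every sample point, i.e. has zero empirical robust loss.

    Assumes `target` is itself a conjunction. The hypothesis starts with all
    2n literals and only ever drops literals, so it stays at least as
    specific as the target and every counterexample is a positive point.
    """
    n = len(sample[0])
    literals = {(i, b) for i in range(n) for b in (0, 1)}
    queries = 0
    progress = True
    while progress:
        progress = False
        for x in sample:
            agree, z = leq_oracle(conjunction(literals), target, x, lam)
            queries += 1
            if not agree:
                # Keep only the literals satisfied by the positive counterexample.
                literals = {(i, b) for (i, b) in literals if z[i] == b}
                progress = True
    return literals, queries
```

Each counterexample removes at least one literal, so this sketch receives at most $2n$ counterexamples before every query returns agreement. The brute-force oracle enumerates the whole ball and is only practical for small $n$ and $\lambda$.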
