In this work, we study the implications of the implicit bias of gradient flow for generalization and adversarial robustness in ReLU networks. We focus on a setting where the data consists of clusters and the correlations between cluster means are small, and show that in two-layer ReLU networks gradient flow is biased towards solutions that generalize well but are highly vulnerable to adversarial examples. Our results hold even when the network has many more parameters than training examples. Although such overparameterized settings allow for harmful overfitting, we prove that the implicit bias of gradient flow prevents it. However, the implicit bias also leads to non-robust solutions (susceptible to small adversarial $\ell_2$-perturbations), even though robust networks that fit the data exist.
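To make the setting concrete, the following is a minimal sketch of clustered data with nearly uncorrelated cluster means, a two-layer ReLU network, and a single $\ell_2$-bounded perturbation step along the input gradient. It is an illustration under stated assumptions, not the construction or proof used in this work: the function names (`make_clustered_data`, `two_layer_relu`, `input_gradient`, `l2_perturb`), the dimensions, the noise scale, the alternating cluster labels, and the randomly initialized (untrained) weights are all hypothetical choices made for the example.

```python
import numpy as np


def make_clustered_data(k=4, n_per_cluster=5, d=500, noise=0.1, seed=0):
    """Sample points around k unit-norm cluster means in R^d.

    Assumption: random Gaussian directions in high dimension are nearly
    orthogonal, which stands in for "small correlations between cluster means".
    """
    rng = np.random.default_rng(seed)
    means = rng.standard_normal((k, d))
    means /= np.linalg.norm(means, axis=1, keepdims=True)
    # Hypothetical labeling: clusters alternate between +1 and -1.
    cluster_labels = np.where(np.arange(k) % 2 == 0, 1.0, -1.0)
    jitter = noise * rng.standard_normal((k * n_per_cluster, d)) / np.sqrt(d)
    X = np.repeat(means, n_per_cluster, axis=0) + jitter
    y = np.repeat(cluster_labels, n_per_cluster)
    return X, y, means


def two_layer_relu(x, W, b, v):
    """f(x) = sum_j v_j * relu(<w_j, x> + b_j)."""
    return float(v @ np.maximum(W @ x + b, 0.0))


def input_gradient(x, W, b, v):
    """Gradient of f with respect to x (valid away from the ReLU kinks)."""
    active = (W @ x + b > 0).astype(float)
    return W.T @ (v * active)


def l2_perturb(x, label, W, b, v, eps):
    """Move x by l2-distance eps in the direction that lowers label * f(x) to first order."""
    g = input_gradient(x, W, b, v)
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return x.copy()  # no active neurons: the gradient gives no direction
    return x - eps * label * g / g_norm


# Usage with an untrained, randomly initialized network (illustrative only).
X, y, means = make_clustered_data()
rng = np.random.default_rng(1)
m = 16  # hidden width, chosen arbitrarily
W = rng.standard_normal((m, X.shape[1]))
b = np.zeros(m)
v = rng.choice([-1.0, 1.0], size=m)
x_adv = l2_perturb(X[0], y[0], W, b, v, eps=0.05)
```

Comparing `y[0] * two_layer_relu(X[0], W, b, v)` with the same quantity at `x_adv` shows how the margin on one example changes under a perturbation of $\ell_2$-norm `eps`. Studying the solutions that gradient flow is biased towards would additionally require training the network, which this sketch deliberately leaves out.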