We provide a complete theory of optimal universal rates for binary classification in the agnostic setting. This extends the realizable-case theory of Bousquet, Hanneke, Moran, van Handel, and Yehudayoff (2021) by removing the realizability assumption on the distribution. We identify a fundamental tetrachotomy of optimal rates: for every concept class, the optimal universal rate of convergence of the excess error rate is one of $e^{-n}$, $e^{-o(n)}$, $o(n^{-1/2})$, or arbitrarily slow. We further identify simple combinatorial structures which determine which of these categories any given concept class falls into.