We study the supervised training dynamics of neural classifiers through the lens of binary hypothesis testing. We model classification as a set of binary tests between class-conditional distributions of representations and show empirically that, along training trajectories, well-generalizing networks increasingly align with Neyman-Pearson optimal decision rules: the Kullback-Leibler (KL) divergence between class-conditional distributions, which governs the error exponents of these tests, improves monotonically. We conclude by discussing how this framework both explains the behavior of different classes of neural networks and suggests possible training or regularization strategies.
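As a concrete illustration of this hypothesis-testing view (a minimal sketch under simplifying assumptions, not the paper's implementation), the snippet below fits Gaussian class-conditional distributions to a layer's representations and computes the closed-form KL divergence between them; by Stein's lemma, this divergence is the best achievable type-II error exponent of the Neyman-Pearson test between the two classes. The Gaussian assumption, the function names, and the checkpoint interface in the usage comment are all illustrative choices, not the authors'.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1, eps=1e-6):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) ).

    A small ridge `eps` keeps the covariances invertible when the
    representation dimension is large relative to the sample count.
    """
    d = mu0.shape[0]
    cov0 = cov0 + eps * np.eye(d)
    cov1 = cov1 + eps * np.eye(d)
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - d
                  + logdet1 - logdet0)

def pairwise_error_exponent(reps, labels, class_a, class_b):
    """Fit Gaussians to the class-conditional representations and return
    D(P_a || P_b). By Stein's lemma, the type-II error of the optimal
    Neyman-Pearson likelihood-ratio test between the two classes decays
    as exp(-n * D(P_a || P_b)) at any fixed type-I error level.
    """
    za = reps[labels == class_a]
    zb = reps[labels == class_b]
    mu_a, cov_a = za.mean(axis=0), np.cov(za, rowvar=False)
    mu_b, cov_b = zb.mean(axis=0), np.cov(zb, rowvar=False)
    return gaussian_kl(mu_a, cov_a, mu_b, cov_b)

# Hypothetical usage along a training trajectory: `get_representations`
# is assumed to return (n, d) penultimate-layer features and labels for
# a held-out set at a given checkpoint. A monotonically increasing
# exponent would reflect the alignment described in the abstract.
# for step, ckpt in enumerate(checkpoints):
#     reps, labels = get_representations(ckpt)
#     print(step, pairwise_error_exponent(reps, labels, 0, 1))
```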