Classic learning theory suggests that proper regularization is the key to good generalization and robustness. In classification, current training schemes target only the complexity of the classifier itself, which can be misleading and ineffective. Instead, we advocate directly measuring the complexity of the decision boundary. The existing literature on this topic is limited, with few well-established definitions of boundary complexity. As a proof of concept, we start by analyzing ReLU neural networks, whose boundary complexity can be conveniently characterized by the number of affine pieces. With the help of tropical geometry, we develop a novel method that explicitly counts the exact number of boundary pieces and, as a by-product, the exact number of total affine pieces. Numerical experiments uncover distinctive properties of our boundary complexity. First, during training, the boundary piece count appears largely independent of other measures, e.g., the total piece count and the $\ell_2$ norm of the weights. Second, the boundary piece count is negatively correlated with robustness: popular robust training techniques, e.g., adversarial training or random noise injection, are found to reduce the number of boundary pieces.
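To make the two quantities concrete, the following is a minimal sketch that estimates them by brute-force sampling. It is not the tropical-geometry method described above, and it yields only lower bounds rather than exact counts. It assumes a binary classifier with a scalar output whose decision boundary is the zero level set; the function name `count_pieces` and the toy network are hypothetical and chosen purely for illustration. Each distinct ReLU activation pattern observed among the samples corresponds to one affine piece, and a piece is counted as a boundary piece when its samples take both signs of the output.

```python
import numpy as np


def count_pieces(weights, biases, points):
    """Estimate (total pieces, boundary pieces) from sample points.

    Assumption: the last layer has a single output and the decision
    boundary is the set where that output equals zero. Both counts are
    lower bounds, since only sampled pieces can be detected.
    """
    h = points
    patterns = []
    for W, b in zip(weights[:-1], biases[:-1]):
        z = h @ W.T + b
        patterns.append(z > 0)       # which ReLUs are active at each point
        h = np.maximum(z, 0.0)       # ReLU
    out = (h @ weights[-1].T + biases[-1]).ravel()
    codes = np.concatenate(patterns, axis=1)

    # Map each activation pattern to (saw positive output, saw negative output).
    seen = {}
    for row, value in zip(codes, out):
        key = row.tobytes()
        pos, neg = seen.get(key, (False, False))
        seen[key] = (pos or bool(value > 0), neg or bool(value < 0))

    total = len(seen)
    # The output is affine on each piece, so a sign change inside a piece
    # means the zero level set passes through it.
    boundary = sum(1 for pos, neg in seen.values() if pos and neg)
    return total, boundary


# Hypothetical toy network: f(x, y) = relu(x) + relu(y) - 1.
W1, b1 = np.eye(2), np.zeros(2)
W2, b2 = np.ones((1, 2)), np.array([-1.0])

axis = np.linspace(-2.0, 2.0, 41)
grid = np.array([(x, y) for x in axis for y in axis])

total, boundary = count_pieces([W1, W2], [b1, b2], grid)
print(total, boundary)
```

For this toy network the four quadrants are the four affine pieces. The output is constantly -1 on the quadrant where both inputs are non-positive, so the boundary crosses only the other three, and the script prints `4 3`. This small case already shows that the boundary piece count can differ from the total piece count.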