The effect of relative entropy asymmetry is analyzed in the context of empirical risk minimization (ERM) with relative entropy regularization (ERM-RER). Two regularizations are considered: $(a)$ the relative entropy of the measure to be optimized with respect to a reference measure (Type-I ERM-RER); and $(b)$ the relative entropy of the reference measure with respect to the measure to be optimized (Type-II ERM-RER). The main result is the characterization of the solution to the Type-II ERM-RER problem and its key properties. By comparing the well-understood Type-I ERM-RER with Type-II ERM-RER, the effects of entropy asymmetry are highlighted. The analysis shows that in both cases, regularization by relative entropy forces the solution's support to collapse into the support of the reference measure, introducing a strong inductive bias that can overshadow the evidence provided by the training data. Finally, it is shown that Type-II regularization is equivalent to Type-I regularization with an appropriate transformation of the empirical risk function.
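The two regularizations can be compared numerically on a small example. The sketch below is illustrative only and rests on assumptions not stated in the abstract: a finite hypothesis set, a nonnegative empirical risk vector `risk`, a reference distribution `q`, and a regularization weight `lam`. In that finite setting, the first-order optimality conditions give a Gibbs form $P(\theta) \propto Q(\theta)\exp(-L(\theta)/\lambda)$ for Type-I and $P(\theta) = \lambda Q(\theta)/(L(\theta)+\beta)$ for Type-II, where $\beta$ is a normalizing constant found here by bisection. The function names and the numerical values are hypothetical.

```python
import numpy as np

def type1_solution(risk, q, lam):
    """Type-I sketch: minimize E_P[risk] + lam * D(P || Q) on a finite set."""
    risk = np.asarray(risk, dtype=float)
    q = np.asarray(q, dtype=float)
    # Gibbs form; subtracting the minimum risk only improves numerical stability.
    w = q * np.exp(-(risk - risk.min()) / lam)
    return w / w.sum()

def type2_solution(risk, q, lam, iters=100):
    """Type-II sketch: minimize E_P[risk] + lam * D(Q || P) on a finite set."""
    risk = np.asarray(risk, dtype=float)
    q = np.asarray(q, dtype=float)
    support = q > 0
    r_min = risk[support].min()
    # The total mass is decreasing in beta on (-r_min, inf): it diverges as
    # beta -> -r_min and is at most 1 at beta = -r_min + lam.
    lo, hi = -r_min, -r_min + lam
    for _ in range(iters):
        beta = 0.5 * (lo + hi)
        mass = np.sum(lam * q[support] / (risk[support] + beta))
        if mass > 1.0:
            lo = beta
        else:
            hi = beta
    beta = hi
    p = np.zeros_like(q)
    p[support] = lam * q[support] / (risk[support] + beta)
    return p / p.sum(), beta

# Hypothetical data: the last hypothesis has the lowest empirical risk
# but lies outside the support of the reference distribution.
risk = np.array([0.1, 0.5, 2.0, 0.0])
q = np.array([0.25, 0.25, 0.5, 0.0])
lam = 0.5

p1 = type1_solution(risk, q, lam)
p2, beta = type2_solution(risk, q, lam)

# Equivalence check: Type-II coincides with Type-I applied to the transformed
# risk log(risk + beta) with unit regularization weight (on the support of q).
support = q > 0
transformed = np.zeros_like(risk)
transformed[support] = np.log(risk[support] + beta)
p2_via_type1 = type1_solution(transformed, q, 1.0)

print("Type-I :", p1)
print("Type-II:", p2)
print("Type-II via transformed Type-I:", p2_via_type1)
```

In this example both solutions assign zero probability to the last hypothesis even though it has the smallest empirical risk, which illustrates the support collapse described above. The final check mirrors the stated equivalence between Type-II regularization and Type-I regularization with a transformed risk, here taken to be the logarithmic transformation that follows from the finite-set optimality condition.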