This paper presents the solution to the empirical risk minimization problem with $f$-divergence regularization (ERM-$f$DR) and establishes conditions under which this solution also solves the problem of minimizing the expected empirical risk subject to an $f$-divergence constraint. The proposed approach applies to a broader class of $f$-divergences than previously reported and recovers previously known results. Additionally, the difference between the expected empirical risk of the ERM-$f$DR solution and that of its reference measure is characterized, providing insights into previously studied cases of $f$-divergences. A central contribution is the introduction of the normalization function, a mathematical object that is critical in both the dual formulation and the practical computation of the ERM-$f$DR solution. This work characterizes the normalization function implicitly by means of a nonlinear ordinary differential equation (ODE), establishes its key properties, and leverages them to construct a numerical algorithm for approximating the normalization factor under mild assumptions. Further analysis demonstrates structural equivalences between ERM-$f$DR problems with different $f$-divergences via transformations of the empirical risk. Finally, the proposed algorithm is used to compute the training and test risks of ERM-$f$DR solutions under different $f$-divergence regularizers. This numerical example highlights the practical implications of choosing different functions $f$ in ERM-$f$DR problems.
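To make the role of the normalization concrete, the following is a minimal sketch on a finite model set, not the algorithm of this paper. It assumes models $\theta_1, \dots, \theta_n$ with empirical risks $L_i$, reference weights $q_i$, and regularization factor $\lambda > 0$. Under these assumptions, stationarity of the Lagrangian of $\sum_i p_i L_i + \lambda \sum_i q_i f(p_i/q_i)$ subject to $\sum_i p_i = 1$ gives $p_i/q_i = (f')^{-1}\left(-(\beta + L_i)/\lambda\right)$, where the scalar $\beta$ is fixed by normalization. The sign convention for $\beta$, the helper names `solve_beta` and `erm_fdr_weights`, the toy risks, and the use of plain bisection (instead of the ODE-based construction mentioned above) are all assumptions made for illustration.

```python
import numpy as np

def solve_beta(risks, q, lam, f_prime_inv, lo, hi, iters=200):
    """Bisection for beta with sum_i q_i * f_prime_inv(-(beta + L_i)/lam) = 1.

    Hypothetical helper: the left-hand side is decreasing in beta when f is
    strictly convex, so a sign change on [lo, hi] brackets the root.
    """
    def excess(beta):
        return np.sum(q * f_prime_inv(-(beta + risks) / lam)) - 1.0

    assert excess(lo) > 0.0 > excess(hi), "bracket must contain the root"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid   # total mass too large: increase beta
        else:
            hi = mid   # total mass too small: decrease beta
    return 0.5 * (lo + hi)

def erm_fdr_weights(risks, q, lam, f_prime_inv, lo, hi):
    """Return (p, beta) for the finite-set problem (illustrative only)."""
    beta = solve_beta(risks, q, lam, f_prime_inv, lo, hi)
    p = q * f_prime_inv(-(beta + risks) / lam)
    return p, beta

# Toy data (assumed values, not taken from the paper).
risks = np.array([0.1, 0.4, 0.7])
q = np.full(3, 1.0 / 3.0)
lam = 0.5

# KL divergence: f(x) = x log x, f'(x) = log x + 1, (f')^{-1}(y) = exp(y - 1).
p_kl, beta_kl = erm_fdr_weights(
    risks, q, lam, lambda y: np.exp(y - 1.0), lo=-10.0, hi=10.0
)

# Closed-form Gibbs weights for comparison in the KL case.
gibbs = q * np.exp(-risks / lam)
gibbs /= gibbs.sum()

# Reverse KL: f(x) = -log x, f'(x) = -1/x, (f')^{-1}(y) = -1/y for y < 0,
# which requires beta > -min(risks); the bracket respects that constraint.
p_rkl, beta_rkl = erm_fdr_weights(
    risks, q, lam, lambda y: -1.0 / y, lo=-risks.min() + 1e-9, hi=100.0
)

print("KL weights:        ", p_kl)
print("Gibbs weights:     ", gibbs)
print("Reverse-KL weights:", p_rkl)
```

In this toy setting the KL weights coincide with the Gibbs weights, while the reverse-KL weights take the form $q_i \lambda / (\beta + L_i)$ and therefore decay polynomially rather than exponentially in the risk. This is one way the choice of $f$ changes the resulting solution.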