PAC generalization bounds on the risk, when expressed in terms of the expected loss, are often insufficient to capture imbalances between subgroups in the data. To overcome this limitation, we introduce a new family of risk measures, called constrained f-entropic risk measures, which use f-divergences to enable finer control over distributional shifts and subgroup imbalances, and which include the Conditional Value at Risk (CVaR), a well-known risk measure. We derive both classical and disintegrated PAC-Bayesian generalization bounds for this family of risks, providing the first disintegrated PAC-Bayesian guarantees beyond standard risks. Building on this theory, we design a self-bounding algorithm that directly minimizes our bounds, yielding models with guarantees at the subgroup level. Finally, we empirically demonstrate the usefulness of our approach.
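To make the contrast between the expected loss and a tail-sensitive risk concrete, the following minimal sketch computes the standard empirical CVaR of a list of per-example losses. It is an illustration only, not the constrained f-entropic risk measure introduced in this work: the function name `empirical_cvar` and the convention that `alpha` denotes the tail mass (the fraction of worst losses averaged) are assumptions made for this example.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst alpha-fraction of losses (hypothetical helper).

    Assumed convention: alpha in (0, 1] is the tail mass, so alpha = 1
    recovers the plain mean (expected loss) and small alpha focuses on
    the largest losses.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not losses:
        raise ValueError("losses must be non-empty")

    ordered = sorted(losses, reverse=True)  # worst losses first
    n = len(ordered)
    k = alpha * n                # number of tail samples, possibly fractional
    full = min(math.floor(k), n)  # samples that count with full weight
    total = sum(ordered[:full])
    if full < n:
        # The next sample contributes only the fractional remainder of k.
        total += (k - full) * ordered[full]
    return total / k


losses = [1.0, 2.0, 3.0, 4.0]
mean_risk = empirical_cvar(losses, 1.0)   # plain average of all losses
tail_risk = empirical_cvar(losses, 0.5)   # average of the two worst losses
```

With these four losses, the mean is 2.5 while the CVaR at tail mass 0.5 is 3.5, which shows how a tail-sensitive risk weights poorly served examples more heavily than the expected loss does.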