Separating performance metrics from gradient-based loss functions may not always give optimal results and may miss vital aggregate information. This paper investigates incorporating a performance metric alongside differentiable loss functions to inform training outcomes. The goal is to guide model performance and interpretation by assuming statistical distributions on this performance metric for dynamic weighting. The focus is on van Rijsbergen's $F_{\beta}$ metric, a popular choice for gauging classification performance. Through distributional assumptions on $F_{\beta}$, an intermediary link can be established to the standard binary cross-entropy via dynamic penalty weights. First, the $F_{\beta}$ metric is reformulated to facilitate assuming statistical distributions, with accompanying proofs for the cumulative distribution function. These probabilities are used within a knee-curve algorithm to find an optimal $\beta$, denoted $\beta_{opt}$. This $\beta_{opt}$ is used as a weight or penalty in the proposed weighted binary cross-entropy. Experiments on publicly available data with imbalanced classes mostly yield better and more interpretable results than the baseline. For example, on the IMDB text data, which has known labeling errors, a 14% boost is shown. This methodology can provide better interpretation.
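To make the two quantities named in the abstract concrete, the following is a minimal sketch of the $F_{\beta}$ metric computed from confusion-matrix counts and of a binary cross-entropy with a scalar penalty weight. It is an illustration under stated assumptions, not the paper's implementation: the function names `f_beta` and `weighted_bce` are hypothetical, the weight is assumed to multiply only the positive-class term, and `pos_weight` merely stands in for $\beta_{opt}$, whose derivation through the distributional assumptions and the knee-curve algorithm is not reproduced here.

```python
import numpy as np


def f_beta(tp, fp, fn, beta=1.0):
    """F_beta from confusion-matrix counts (standard definition)."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives of any kind, to avoid 0/0.
    return (1 + b2) * tp / denom if denom > 0 else 0.0


def weighted_bce(y_true, y_prob, pos_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with a scalar weight on the positive term.

    Assumption: the weight (standing in for beta_opt) scales only the
    positive-class term; the paper's exact weighting scheme may differ.
    """
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log() never receives 0 or 1 exactly.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    per_example = -(
        pos_weight * y_true * np.log(y_prob)
        + (1 - y_true) * np.log(1 - y_prob)
    )
    return float(per_example.mean())
```

With `pos_weight=1.0` the second function reduces to the standard binary cross-entropy, so the weight can be read as a dial on how heavily errors on the positive class are penalized relative to that baseline.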