The development of multi-modal, probabilistic prediction models has led to a need for comprehensive evaluation metrics. While several metrics can characterize the accuracy of machine-learned models (e.g., negative log-likelihood, Jensen-Shannon divergence), these metrics typically operate on probability densities. Applying them to purely sample-based prediction models thus requires that the underlying density function be estimated. However, common methods such as kernel density estimation (KDE) have been demonstrated to lack robustness, while more complex methods have not been evaluated in multi-modal estimation problems. In this paper, we present ROME (RObust Multi-modal density Estimator), a non-parametric approach for density estimation which addresses the challenge of estimating multi-modal, non-normal, and highly correlated distributions. ROME utilizes clustering to segment a multi-modal set of samples into multiple uni-modal ones and then combines simple KDE estimates obtained for the individual clusters into a single multi-modal estimate. We compare our approach to state-of-the-art methods for density estimation as well as to ablations of ROME, showing that it not only outperforms established methods but is also more robust across a variety of distributions. Our results demonstrate that ROME can overcome the issues of over-fitting and over-smoothing exhibited by other estimators, promising a more robust evaluation of probabilistic machine learning models.
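The cluster-then-estimate idea described in the abstract can be illustrated with a minimal sketch. It is not the ROME implementation: the abstract does not specify the clustering algorithm, the kernel, or the bandwidth rule, so the choices below (plain k-means with a user-supplied `k`, Gaussian kernels shaped by each cluster's covariance, Silverman's bandwidth factor, and cluster weights proportional to cluster size) are assumptions made purely for illustration. The function names `kmeans`, `gaussian_kde_logpdf`, and `clustered_kde_logpdf` are likewise hypothetical.

```python
import numpy as np


def kmeans(samples, k, iters=50, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the paper's clustering step)."""
    rng = np.random.default_rng(seed)
    centers = samples[rng.choice(len(samples), size=k, replace=False)].astype(float)
    labels = np.zeros(len(samples), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = samples[labels == j].mean(axis=0)
    return labels


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    mx = a.max(axis=axis, keepdims=True)
    return np.squeeze(mx, axis=axis) + np.log(np.exp(a - mx).sum(axis=axis))


def gaussian_kde_logpdf(points, data):
    """Log-density of a Gaussian KDE fitted to one (ideally uni-modal) cluster.

    The kernel covariance is the cluster covariance scaled by Silverman's
    factor (an assumption), so correlated clusters get correlated kernels.
    """
    n, d = data.shape
    factor = (n * (d + 2) / 4.0) ** (-1.0 / (d + 4))
    cov = np.atleast_2d(np.cov(data, rowvar=False)) * factor**2
    cov = cov + 1e-9 * np.eye(d)  # small jitter keeps the Cholesky factor valid
    chol = np.linalg.cholesky(cov)
    diff = points[:, None, :] - data[None, :, :]          # shape (m, n, d)
    z = np.linalg.solve(chol, diff.reshape(-1, d).T)      # whitened differences
    maha = (z**2).sum(axis=0).reshape(len(points), n)     # squared Mahalanobis
    log_norm = -0.5 * d * np.log(2 * np.pi) - np.log(np.diag(chol)).sum()
    return logsumexp(log_norm - 0.5 * maha, axis=1) - np.log(n)


def clustered_kde_logpdf(points, samples, k):
    """Mixture of per-cluster KDEs, weighted by relative cluster size."""
    labels = kmeans(samples, k)
    clusters = [samples[labels == j] for j in np.unique(labels)]
    clusters = [c for c in clusters if len(c) >= 2]  # covariance needs >= 2 points
    total = sum(len(c) for c in clusters)
    parts = [np.log(len(c) / total) + gaussian_kde_logpdf(points, c) for c in clusters]
    return logsumexp(np.stack(parts), axis=0)


# Usage: two strongly correlated 2-D modes, queried at one mode and between modes.
rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
samples = np.vstack([
    rng.multivariate_normal([-4.0, 0.0], cov, size=150),
    rng.multivariate_normal([4.0, 0.0], cov, size=150),
])
queries = np.array([[-4.0, 0.0], [0.0, 0.0]])
log_density = clustered_kde_logpdf(queries, samples, k=2)
```

Fitting the bandwidth per cluster, rather than once over all samples, is what keeps the spread between modes from inflating the kernels. A single global KDE on the same data would pick a bandwidth driven by the distance between the two modes and tend to over-smooth each of them.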