The number of modes in a probability density function reflects the model's complexity and can also be interpreted as the number of subpopulations present. Despite its relevance, little research has been devoted to its estimation. Focusing on the univariate setting, we propose a novel approach that targets prediction accuracy and is inspired by some overlooked aspects of the problem. We argue for the need for structure in the solutions, the subjective and uncertain nature of modes, and the convenience of a holistic view blending global and local density properties. Our method builds upon a combination of flexible kernel estimators and parsimonious compositional splines. Feature exploration, model selection and mode testing are implemented in the Bayesian inference paradigm, providing soft solutions and allowing expert judgement to be incorporated in the process. The usefulness of our proposal is illustrated through a case study in sports analytics, showcasing multiple companion visualisation tools. A thorough simulation study demonstrates that traditional modality-driven approaches paradoxically struggle to provide accurate results. In this context, our method emerges as a top-tier alternative offering innovative solutions for analysts.
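To make the uncertain nature of modes concrete, the following is a minimal sketch, not the method proposed here. It assumes a plain Gaussian kernel density estimate on a synthetic two-component sample and counts the local maxima for several bandwidths. The helper names `gaussian_kde_on_grid` and `count_modes`, the simulated data and the bandwidth values are all illustrative assumptions.

```python
import numpy as np


def gaussian_kde_on_grid(sample, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at every grid point."""
    sample = np.asarray(sample, dtype=float)
    # Standardised distances between each grid point and each observation.
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth


def count_modes(density):
    """Count strict local maxima of a density evaluated on an ordered grid."""
    interior = density[1:-1]
    is_peak = (interior > density[:-2]) & (interior > density[2:])
    return int(is_peak.sum())


# Synthetic sample (an assumption for illustration): two subpopulations.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-3.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])

# Evaluation grid extending slightly beyond the observed range.
grid = np.linspace(sample.min() - 3.0, sample.max() + 3.0, 2001)

# The estimated number of modes depends on the chosen bandwidth.
bandwidths = (0.05, 0.5, 1.0, 5.0)
mode_counts = {
    h: count_modes(gaussian_kde_on_grid(sample, grid, h)) for h in bandwidths
}
print(mode_counts)
```

Small bandwidths tend to produce many spurious modes, while large ones merge genuine subpopulations into a single peak. This sensitivity is one reason a single hard count of modes can mislead.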