The number of modes in a probability density function is representative of the complexity of a model and can also be viewed as the number of subpopulations. Despite its relevance, there has been limited research in this area. A novel approach to estimating the number of modes in the univariate setting is presented, focusing on prediction accuracy and inspired by some overlooked aspects of the problem: the need for structure in the solutions, the subjective and uncertain nature of modes, and the convenience of a holistic view that blends local and global density properties. The technique combines flexible kernel estimators and parsimonious compositional splines in the Bayesian inference paradigm, providing soft solutions and incorporating expert judgment. The procedure includes feature exploration, model selection, and mode testing, illustrated in a sports analytics case study showcasing multiple companion visualisation tools. A thorough simulation study also demonstrates that traditional modality-driven approaches paradoxically struggle to provide accurate results. In this context, the new method emerges as a top-tier alternative, offering innovative solutions for analysts.
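As a rough illustration of why mode counting with kernel estimators carries the uncertainty mentioned above, the sketch below counts the local maxima of a plain Gaussian kernel density estimate at two bandwidths. It is not the method proposed here: it uses no compositional splines and no Bayesian inference. The function names (`gaussian_kde_on_grid`, `count_modes`), the toy sample, and the bandwidth values are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_kde_on_grid(sample, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    sample = np.asarray(sample, dtype=float)
    # Standardised distance between every grid point and every observation.
    z = (grid[:, None] - sample[None, :]) / bandwidth
    norm = sample.size * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


def count_modes(sample, bandwidth, grid_size=512):
    """Count local maxima of the kernel estimate on an evenly spaced grid.

    This is a naive plug-in count (an assumption of this sketch), so the
    answer depends directly on the chosen bandwidth.
    """
    sample = np.asarray(sample, dtype=float)
    pad = 3.0 * bandwidth  # extend the grid so the tails are covered
    grid = np.linspace(sample.min() - pad, sample.max() + pad, grid_size)
    density = gaussian_kde_on_grid(sample, bandwidth, grid)
    interior = density[1:-1]
    # Strictly above the left neighbour and at least the right neighbour,
    # so a two-point flat top is counted once rather than missed.
    is_peak = (interior > density[:-2]) & (interior >= density[2:])
    return int(is_peak.sum())


# Deterministic toy sample with two well-separated clusters (hypothetical data).
toy_sample = np.concatenate([np.linspace(-3.0, -1.0, 50),
                             np.linspace(1.5, 3.5, 60)])

modes_small_bandwidth = count_modes(toy_sample, bandwidth=0.5)
modes_large_bandwidth = count_modes(toy_sample, bandwidth=3.0)
print(modes_small_bandwidth, modes_large_bandwidth)
```

The same data yield different mode counts depending on the bandwidth, which is one reason a single hard answer from a kernel estimate can be misleading and why soft, uncertainty-aware solutions are attractive.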