Understanding the dispersion of real data is essential for characterizing the variability of a distribution. Alongside central tendency, variability is of considerable interest in fields such as the life sciences, meteorology, and economics. The modal interval (MI) describes the dispersion, or spread, of a distribution and represents the most concentrated interval of a univariate unimodal distribution. In this study, we propose a nonlinear modal interval regression (MIR) method that smoothly estimates the conditional MI, providing a robust description of how the dispersion of a data distribution varies with a covariate. First, we use kernel density estimation (KDE) to estimate the quantile levels corresponding to the conditional MI bounds; these levels serve as inputs to the quantile loss function. Second, we fit the upper and lower bound functions by minimizing the quantile loss with smoothing splines. Numerical experiments demonstrate that the reformulated MIR achieves higher accuracy and stability than both the conventional MIR and KDE methods. To evaluate the effectiveness of the proposed approach, we applied it to neonatal hormone data and identified notable rhythms in cortisol and melatonin levels during the first ten days after birth.
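The first step described above (turning a KDE into a pair of quantile levels for the MI bounds) can be illustrated with a minimal, unconditional sketch. It is not the authors' implementation. It assumes the MI is taken as the shortest interval holding a fixed probability mass `coverage`, a definition the abstract does not fix, and it ignores the covariate and the smoothing-spline fit entirely. The function names, the normal-reference bandwidth rule, and the grid-based search are all illustrative choices.

```python
import math
import random
import statistics


def gaussian_kde_grid(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


def modal_interval_levels(samples, coverage=0.5, grid_size=200):
    """Return (lower, upper, tau_lower, tau_upper) for the shortest grid
    interval that holds at least `coverage` of the KDE mass (assumed MI)."""
    n = len(samples)
    # Normal-reference rule-of-thumb bandwidth (an illustrative choice).
    bandwidth = 1.06 * statistics.stdev(samples) * n ** (-0.2)
    lo = min(samples) - 3.0 * bandwidth
    hi = max(samples) + 3.0 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    density = gaussian_kde_grid(samples, grid, bandwidth)

    # Cumulative distribution via a Riemann sum, normalized to end at 1.
    cdf, total = [], 0.0
    for d in density:
        total += d * step
        cdf.append(total)
    cdf = [c / total for c in cdf]

    # Two-pointer scan: for each left index i, advance j to the first
    # index where the enclosed mass reaches `coverage`; keep the shortest.
    best, j = None, 0
    for i in range(grid_size):
        while j < grid_size and cdf[j] - cdf[i] < coverage:
            j += 1
        if j == grid_size:
            break
        if best is None or (j - i) < (best[1] - best[0]):
            best = (i, j)
    i, j = best
    return grid[i], grid[j], cdf[i], cdf[j]


def pinball_loss(residual, tau):
    """Quantile (pinball) loss for residual = y - prediction at level tau."""
    return residual * tau if residual >= 0 else residual * (tau - 1.0)


# Demo on a right-skewed sample: the estimated levels need not be symmetric
# around 0.5, which is why they are estimated rather than fixed in advance.
random.seed(0)
data = [random.gammavariate(2.0, 1.0) for _ in range(400)]
lower, upper, tau_lo, tau_hi = modal_interval_levels(data, coverage=0.5)
print(f"MI ~ [{lower:.2f}, {upper:.2f}], levels ({tau_lo:.2f}, {tau_hi:.2f})")
```

In a conditional setting, the two levels would be estimated as the covariate varies and then passed as `tau` to the quantile loss when fitting the lower and upper bound functions; that fitting step is outside the scope of this sketch.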