We propose a hidden Markov model for univariate time series of proportions taking values in (0,1), in which regime switching captures latent structural changes and the emission distribution belongs to the Beta family. In each latent state, the Beta mean is linked to covariates through a generalized additive model (GAM) with spline-based smooth functions, while the Beta precision is state-specific; this allows flexible modeling of both nonlinear covariate effects and regime-dependent variability. Estimation is carried out via a penalized expectation--maximization algorithm that combines smoothing with numerical maximization of the penalized emission likelihood. To select the number of latent states and the smoothing penalty, we implement a grid search guided by standard information criteria (Akaike Information Criterion, Bayesian Information Criterion, and Integrated Completed Likelihood), with a diagnostic filter that removes degenerate solutions characterized by explosive precision estimates. Uncertainty in the transition probabilities and state-dependent parameters is quantified through a parametric bootstrap. Simulation results demonstrate accurate recovery of transition dynamics and state precisions, as well as accurate latent-state decoding. A motivating application to Russian age-specific mortality data (1960--2014, ages 0--40) illustrates how the proposed model summarizes smooth age patterns in female-to-total mortality ratios while identifying two persistent latent regimes that admit a substantive demographic interpretation in light of the country's well-documented mortality shocks over the second half of the twentieth century.
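To make the model structure concrete, the following is a minimal sketch of the likelihood of a Beta hidden Markov model, written under assumptions that the abstract does not state. It assumes the common mean-precision parameterization Beta(mu*phi, (1-mu)*phi) and takes the state-specific means as precomputed inputs rather than fitting the GAM smooths. The function names (`beta_logpdf`, `logsumexp`, `hmm_loglik`) and the argument layout are hypothetical and are not taken from the authors' implementation.

```python
import math


def beta_logpdf(y, mu, phi):
    """Log density at y in (0, 1) of a Beta with mean mu and precision phi.

    Assumed parameterization: shape parameters a = mu*phi, b = (1-mu)*phi.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log1p(-y))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_loglik(y, mu, phi, init, trans):
    """Log-likelihood of a Beta HMM via the forward recursion in log space.

    y       : list of T observations in (0, 1)
    mu      : mu[k][t], mean of state k at time t (precomputed here;
              in the paper it would come from the state-specific GAM)
    phi     : phi[k], precision of state k
    init    : init[k], initial state probabilities (assumed > 0)
    trans   : trans[j][k], probability of moving from j to k (assumed > 0)
    """
    K = len(phi)
    log_alpha = [math.log(init[k]) + beta_logpdf(y[0], mu[k][0], phi[k])
                 for k in range(K)]
    for t in range(1, len(y)):
        log_alpha = [
            logsumexp([log_alpha[j] + math.log(trans[j][k]) for j in range(K)])
            + beta_logpdf(y[t], mu[k][t], phi[k])
            for k in range(K)
        ]
    return logsumexp(log_alpha)
```

In a penalized EM scheme of the kind described, a quantity like this would be monitored across iterations, with a spline roughness penalty subtracted from the emission part in the M-step; that penalty and the smooth-function fitting are omitted from this sketch.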