Although model-based clustering continues to advance, challenges persist in modeling certain data types, such as panel data. Multivariate panel data are difficult for clustering algorithms because observing several subjects at multiple time points induces a distinctive correlation structure. Panel data are also often affected by missing values and dropout, which complicate estimation. This research presents a family of hidden Markov models that account for the correlation structures that arise in panel data. A modified expectation-maximization (EM) algorithm that can handle dropout and data that are missing not at random (MNAR) is presented and used for model estimation.