Polychoric correlation is often an important building block in the analysis of rating data, particularly for structural equation models. However, the commonly employed maximum likelihood (ML) estimator is highly susceptible to misspecification of the polychoric correlation model, for instance through violations of latent normality assumptions. We propose a novel estimator that is designed to be robust to partial misspecification of the polychoric model, that is, misspecification that affects only an unknown fraction of observations, for instance (but not limited to) careless respondents. In contrast to the existing literature, our estimator makes no assumption about the type or degree of model misspecification. Furthermore, it generalizes ML estimation, is consistent and asymptotically normally distributed, and comes at no additional computational cost. We demonstrate the robustness and practical usefulness of our estimator in simulation studies and in an empirical application to a Big Five administration. In the latter, the polychoric correlation estimates obtained from our estimator and from ML differ substantially; further inspection suggests that this difference is likely due to the presence of careless respondents, whom the estimator helps identify.