Researchers in the behavioral and social sciences use linear discriminant analysis (LDA) to predict group membership (classification) and to identify the variables most relevant to group separation among a set of continuous correlated variables (description). In these and other disciplines, longitudinal data are often collected, which provide additional temporal information. Linear classification methods for repeated measures data are more sensitive to actual group differences because they take the complex correlations between time points and variables into account, but they are rarely discussed in the literature. Moreover, psychometric data rarely fulfill the multivariate normality assumption. In this paper, we compare existing linear classification algorithms for nonnormally distributed multivariate repeated measures data in a simulation study based on psychological questionnaire data comprising Likert scales. The results show that for data without any specific assumed structure and with larger sample sizes, the robust alternatives to standard repeated measures LDA may not be needed. To our knowledge, this is one of the few studies discussing repeated measures classification techniques, and the first to compare multiple alternatives with one another.