Hierarchical factor models, which include the bifactor model as a special case, are useful in the social and behavioural sciences for measuring hierarchically structured constructs. Specifying a hierarchical factor model involves imposing hierarchically structured zero constraints on a factor loading matrix, which is often challenging. Therefore, an exploratory analysis is needed to learn the hierarchical factor structure from data. Unfortunately, there is neither an identifiability theory establishing when this hierarchical structure can be learned, nor a computationally efficient method with provable performance. The Schmid-Leiman transformation, often regarded as the default method for exploratory hierarchical factor analysis, is flawed and likely to fail. The contribution of this paper is threefold. First, an identifiability result is established for general hierarchical factor models, which shows that the hierarchical factor structure is learnable under mild regularity conditions. Second, a computationally efficient divide-and-conquer approach is proposed for learning the hierarchical factor structure. Finally, asymptotic theory is established for the proposed method, showing that it consistently recovers the true hierarchical factor structure as the sample size grows to infinity. The power of the proposed method is shown via simulation studies and a real data application to a personality test. The code for the proposed method is publicly available at https://github.com/EmetSelch97/EHFA/.
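As a rough illustration of what "hierarchically structured zero constraints" on a loading matrix can look like, the sketch below builds a toy bifactor pattern and checks a simple nesting property. Everything in it is an assumption made for illustration: the six-item pattern, the function name `is_hierarchical`, and the reading of "hierarchical" as "the item sets of any two factors are either nested or disjoint". It is not the paper's formal definition, and it is unrelated to the proposed divide-and-conquer method or the EHFA code.

```python
import numpy as np

# Hypothetical toy pattern: 6 items (rows), 1 general factor and 2 group
# factors (columns). A 1 marks a free loading; a 0 marks a loading that is
# constrained to zero.
bifactor_pattern = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 1],
    [1, 0, 1],
])


def is_hierarchical(pattern):
    """Return True if the item sets of any two factors are nested or disjoint.

    This is an assumed, simplified reading of "hierarchical structure",
    used here only to make the zero-constraint idea concrete.
    """
    # Item indices with a free (nonzero) loading, one set per factor.
    supports = [set(np.flatnonzero(col)) for col in pattern.T]
    for i, a in enumerate(supports):
        for b in supports[i + 1:]:
            overlap = a & b
            # A partial overlap means neither nested nor disjoint.
            if overlap and overlap != a and overlap != b:
                return False
    return True
```

Under this reading, the bifactor pattern passes the check, whereas a pattern in which two group factors share only some of their items would not.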