In this study, we develop a latent factor model for analysing high-dimensional binary data. Specifically, a standard probit model describes the regression relationship between the observed binary data and continuous latent factors. Our method assumes that the dependency structure of the observed binary data is fully captured by these latent factors. To estimate the model, we develop a moment-based estimation method that handles both the discreteness of the data and its high dimensionality. Most importantly, the asymptotic properties of the resulting estimators are rigorously established. Extensive simulation studies are presented to demonstrate the proposed methodology, and a real dataset of product descriptions is analysed for illustration.
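To make the model structure concrete, the following minimal sketch simulates binary data from a probit latent factor model and recovers the latent correlation matrix from pairwise moments. It is an illustration under stated assumptions, not the estimator developed in this study: it assumes zero intercepts, standard normal factors and noise, and uses the classical bivariate normal orthant identity P(Y_j = 1, Y_k = 1) = 1/4 + arcsin(ρ_jk)/(2π). All function names, dimensions, and the loading matrix are hypothetical.

```python
import numpy as np


def simulate_binary_factor_data(n, loadings, rng):
    """Draw Y = 1(B f + e > 0) with f ~ N(0, I_d) and e ~ N(0, I_p).

    Assumption: no intercept, so each Y_j has marginal probability 1/2.
    """
    p, d = loadings.shape
    factors = rng.standard_normal((n, d))
    noise = rng.standard_normal((n, p))
    latent = factors @ loadings.T + noise
    return (latent > 0).astype(int)


def true_latent_correlation(loadings):
    """Correlation matrix of the latent Gaussian vector B f + e."""
    p = loadings.shape[0]
    cov = loadings @ loadings.T + np.eye(p)
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


def latent_correlation_from_moments(y):
    """Invert the orthant identity using empirical pairwise moments."""
    n = y.shape[0]
    p11 = (y.T @ y) / n  # empirical P(Y_j = 1, Y_k = 1)
    rho = np.sin(2.0 * np.pi * (p11 - 0.25))
    np.fill_diagonal(rho, 1.0)
    return rho


# Hypothetical setting: 5 binary variables driven by 2 latent factors.
rng = np.random.default_rng(0)
loadings = rng.uniform(-1.0, 1.0, size=(5, 2))
y = simulate_binary_factor_data(200_000, loadings, rng)
rho_hat = latent_correlation_from_moments(y)
rho_true = true_latent_correlation(loadings)
max_abs_error = np.max(np.abs(rho_hat - rho_true))
print("max abs error in latent correlation:", max_abs_error)
```

The sketch only shows why second-order moments of the binary data carry information about the latent dependency structure; recovering the loadings themselves, and doing so in high dimensions, requires the estimation procedure developed in the paper.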