Correlated data are ubiquitous in today's data-driven society. While regression models for the means and variances of responses of interest are relatively well developed, comparable models for correlations are largely confined to longitudinal data, a special form of sequentially correlated data. This paper proposes a new method for the analysis of correlations that fully exploits covariates for general correlated data. In a renewed analysis of the Classroom data, a highly unbalanced multilevel clustered data set with within-class and within-school correlations, our method reveals previously unknown insights into these correlation structures. In another analysis of the malaria immune response data in Benin, a longitudinal study with time-dependent covariates in which the exact observation times are not available, our approach again provides promising new results. At the heart of our approach is a new generalized z-transformation that maps correlation matrices, which are constrained to be positive definite, to vectors with unrestricted support, and that is order-invariant. These two properties enable us to develop a regression analysis that incorporates covariates in the modelling of correlations, with estimation by maximum likelihood.
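The abstract does not define the generalized z-transformation, so the following is only a minimal sketch under an assumption: that the transformation behaves like the off-diagonal entries of the matrix logarithm of a correlation matrix, which is one known way to obtain an unconstrained vector from a positive definite correlation matrix. The function names `sym_logm` and `log_z` are hypothetical and are not taken from the paper. For a 2 x 2 correlation matrix this construction reduces to Fisher's z-transformation, arctanh(r), and relabelling the variables only permutes the entries of the resulting vector.

```python
import numpy as np

def sym_logm(C):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)          # C = V diag(w) V^T with w > 0
    return (V * np.log(w)) @ V.T      # V diag(log w) V^T

def log_z(C):
    """Assumed transformation: strictly lower-triangular entries of log(C) as a vector."""
    L = sym_logm(np.asarray(C, dtype=float))
    return L[np.tril_indices_from(L, k=-1)]

# 2 x 2 case: the single entry coincides with Fisher's z, arctanh(r).
r = 0.6
z2 = log_z([[1.0, r], [r, 1.0]])

# 3 x 3 case: reordering the variables permutes the entries of the vector,
# so the multiset of values is unchanged.
C = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, -0.3],
              [0.2, -0.3, 1.0]])
perm = [2, 0, 1]
z = log_z(C)
z_perm = log_z(C[np.ix_(perm, perm)])
```

Because each entry of such a vector can take any real value, it can in principle be linked to covariates through an ordinary linear predictor, which is the kind of regression structure the abstract describes. The inverse map, from a vector back to a valid correlation matrix, is not shown here.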