ProfileGLMM is an R package that uses Generalised Linear Mixed Models (GLMMs) as the outcome model in Bayesian profile regression. Profile regression simultaneously i) explains the variation in an outcome and ii) clusters the observations according to a specified set of interdependent clustering covariates. The inferred cluster memberships then enter the outcome regression as explanatory variables, alongside other covariates. This lets the framework handle complex, highly correlated covariate structures whose direct inclusion in a standard regression model would be statistically sub-optimal. ProfileGLMM extends the scope of Bayesian profile regression by removing two key constraints of previous implementations: 1) it includes random effects, which allows the analysis of hierarchical and longitudinal data, and 2) it allows interactions between the latent clusters and other observed covariates. The package supports continuous and binary outcomes and both categorical and continuous clustering covariates. Built on fast Rcpp code and requiring few mandatory parameters, ProfileGLMM offers a flexible analytical tool that enhances the utility of profile regression for researchers working with complex data in fields such as epidemiology, the social sciences, and clinical studies.
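To make the structure described above concrete, one generic way to write a profile regression with a GLMM outcome is sketched below. The notation is an assumption introduced here for illustration and is not taken from the package: $y_i$ is the outcome of observation $i$ in group $j(i)$, $z_i$ its latent cluster label, $c_i$ its clustering covariates, $x_i$, $u_i$ and $w_i$ the design vectors for fixed effects, cluster interactions and random effects, and $g$ an unspecified link function. The priors and parameterisation actually used by ProfileGLMM may differ.

$$
\begin{aligned}
z_i \mid \pi &\sim \mathrm{Categorical}(\pi_1, \pi_2, \dots), \\
c_i \mid z_i = k &\sim f(c_i \mid \theta_k), \\
b_j &\sim \mathcal{N}(0, \Sigma_b), \\
g\big(\mathbb{E}[\, y_i \mid z_i, b_{j(i)} \,]\big) &= x_i^{\top}\beta + u_i^{\top}\gamma_{z_i} + w_i^{\top} b_{j(i)}.
\end{aligned}
$$

In this sketch the term $u_i^{\top}\gamma_{z_i}$ carries both the cluster effect and its interactions with observed covariates, and $w_i^{\top} b_{j(i)}$ is the random-effects term for hierarchical or longitudinal data.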