Functional data consist of trajectories observed over a continuous domain, such as time, space, or wavelength. We consider curves observed on different groups of subjects and propose a Bayesian multi-group functional factor analysis framework that models the data jointly through an explicit decomposition into group-specific mean functions and latent components that capture both the structure common to all groups and the structure distinct to each group. We represent these functional components as linear combinations of a common set of B-spline basis functions, which yields a low-rank representation of the latent factors. We further place a parameter-expanded cumulative shrinkage process prior on the factor loadings, which induces increasing shrinkage and automatically selects the number of active shared and group-specific factors. In simulation studies under a range of scenarios, the model accurately recovers the number of underlying factors and separates the variation in the functional observations that is driven by shared structure from the variation driven by group-specific structure. In a real-data application, we apply the model to EEG data on alcoholic and healthy subjects and identify shared latent factors that capture characteristic components of the EEG curves, along with group-specific factors that reveal distinct patterns of neural activity.
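To make the role of the shrinkage prior concrete, the following is a minimal sketch of a standard truncated cumulative shrinkage process (a stick-breaking spike-and-slab construction), not the parameter-expanded variant proposed here, whose details the abstract does not give. The function names, the truncation level, and all hyperparameter values (`alpha`, `spike_var`, `slab_shape`, `slab_scale`) are illustrative assumptions.

```python
import random

def cusp_spike_probs(num_factors, alpha, rng):
    """Truncated stick-breaking: pi_h = sum_{l<=h} v_l * prod_{m<l} (1 - v_m)."""
    remaining = 1.0   # length of the stick not yet broken off
    cumulative = 0.0  # running sum of stick-breaking weights
    probs = []
    for h in range(num_factors):
        # Fix the last stick at 1 so the weights sum to one under truncation.
        v = 1.0 if h == num_factors - 1 else rng.betavariate(1.0, alpha)
        cumulative += v * remaining
        remaining *= 1.0 - v
        probs.append(min(cumulative, 1.0))  # guard against float round-off
    return probs

def sample_columns(probs, spike_var, slab_shape, slab_scale, rng):
    """For each factor column, draw (is_spike, variance) from the spike-and-slab mixture."""
    columns = []
    for p in probs:
        if rng.random() < p:
            # Spike: a small fixed variance that effectively switches the factor off.
            columns.append((True, spike_var))
        else:
            # Slab: inverse-gamma(shape, scale) drawn as 1 / Gamma(shape, scale=1/slab_scale).
            columns.append((False, 1.0 / rng.gammavariate(slab_shape, 1.0 / slab_scale)))
    return columns

rng = random.Random(0)
pi = cusp_spike_probs(num_factors=8, alpha=3.0, rng=rng)
columns = sample_columns(pi, spike_var=0.01, slab_shape=2.0, slab_scale=2.0, rng=rng)
num_active = sum(not is_spike for is_spike, _ in columns)
print("spike probabilities:", [round(p, 3) for p in pi])
print("active factors in this draw:", num_active)
```

Because the spike probability is a cumulative sum of non-negative weights, it cannot decrease with the factor index, so later factors are increasingly likely to be switched off. In a sketch like this, the number of columns assigned to the slab plays the role of the number of active factors; one such prior could be placed on the shared loadings and one on each group's loadings, though that arrangement is an assumption here.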