We propose a flexible regression framework to model the conditional distribution of multilevel generalized multivariate functional data of potentially mixed type, e.g., binary and continuous data. We make pointwise parametric distributional assumptions for each dimension of the multivariate functional data and model each distributional parameter as an additive function of covariates. The dependence between the different outcomes and, for multilevel functional data, also between different functions within a level is modelled by shared latent multivariate Gaussian processes. For a parsimonious representation of the latent processes, (generalized) multivariate functional principal components are estimated from the data and used as an empirical basis for these latent processes in the regression framework. Our modular two-step approach is general and can readily incorporate new developments in the estimation of functional principal components for all types of (generalized) functional data. Flexible additive effects of scalar or even functional covariates are available and are estimated in a Bayesian framework. We provide an easy-to-use implementation in the accompanying R package 'gmfamm' on CRAN and conduct a simulation study to confirm the validity of our regression framework and estimation strategy. The proposed multivariate functional model is applied to four-dimensional traffic data from Berlin, which consist of the hourly numbers and mean speeds of cars and trucks at different locations.
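As a reading aid, the model structure described above can be sketched in formulas. The notation is an assumption made for illustration and is not taken from the paper: $K$ outcome dimensions, distributions $\mathcal{D}^{(k)}$ with parameters $\theta_{iq}^{(k)}$, link functions $g_q^{(k)}$, additive effects $f_{qj}^{(k)}$, latent process $\Lambda_i$, estimated multivariate eigenfunctions $\hat{\psi}_m$ with scores $\rho_{im}$ and variances $\nu_m$. The sketch shows a single level only and assumes that the latent process enters just the first predictor of each dimension.

```latex
\begin{align*}
Y_i^{(k)}(t) \mid x_i
  &\sim \mathcal{D}^{(k)}\bigl(\theta_{i1}^{(k)}(t), \dots, \theta_{iQ_k}^{(k)}(t)\bigr),
  \qquad k = 1, \dots, K, \\
g_q^{(k)}\bigl(\theta_{iq}^{(k)}(t)\bigr)
  &= \sum_{j} f_{qj}^{(k)}(x_i, t) + \mathbb{1}\{q = 1\}\, \Lambda_i^{(k)}(t), \\
\Lambda_i(t) = \bigl(\Lambda_i^{(1)}(t), \dots, \Lambda_i^{(K)}(t)\bigr)^{\top}
  &\approx \sum_{m=1}^{M} \rho_{im}\, \hat{\psi}_m(t),
  \qquad \rho_{im} \sim N(0, \nu_m).
\end{align*}
```

Under this reading, the first step of the two-step approach estimates the basis $\hat{\psi}_m$ from the data, and the second step treats that basis as fixed while the additive effects and the scores are estimated in the Bayesian regression.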