Multivariate functional data present theoretical and practical complications that are not found in univariate functional data. One such complication arises when the component functions of multivariate functional data are positive and subject to mutual time warping. That is, the component processes exhibit a common shape but are subject to systematic phase variation across their domains, in addition to subject-specific time warping, where each subject has its own internal clock. This motivates a novel model for multivariate functional data that connects such mutual time warping to a latent deformation-based framework by exploiting a new time warping separability assumption. This separability assumption allows for meaningful interpretation and dimension reduction. The resulting Latent Deformation Model is shown to be well suited to represent commonly encountered functional vector data. The proposed approach combines a random amplitude factor for each component with population-based registration across the components of a multivariate functional data vector and includes a latent population function, which corresponds to a common underlying trajectory. We propose estimators for all components of the model, enabling implementation of the proposed data-based representation for multivariate functional data and downstream analyses such as Fréchet regression. Rates of convergence are established both when curves are fully observed and when they are observed with measurement error. The usefulness of the model, its interpretation, and practical aspects are illustrated in simulations and in applications to multivariate human growth curves and multivariate environmental pollution data.
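To make the structure described above concrete, the following is a minimal simulation sketch, not the authors' specification or estimator. It assumes, purely for illustration, that each curve takes the form X_ij(t) = A_ij · λ(ψ_j(h_i(t))), with a positive latent curve λ, a fixed component-level warp ψ_j, a random subject-level warp h_i, and a random positive amplitude A_ij. The names `latent`, `power_warp`, and `simulate`, the power-function warps, the order of composition, and all parameter values are hypothetical choices; the abstract does not state the exact functional form.

```python
import math
import random


def latent(t):
    """Hypothetical positive latent curve on [0, 1]: one bump above a baseline."""
    return 1.0 + math.exp(-((t - 0.5) ** 2) / 0.02)


def power_warp(t, gamma):
    """Illustrative warp t -> t**gamma: increasing on [0, 1], fixes 0 and 1 if gamma > 0."""
    return t ** gamma


def simulate(n_subjects=5, component_gammas=(0.7, 1.0, 1.4), n_grid=51, seed=0):
    """Return (grid, curves), where curves[i][j] lists the values of X_ij on the grid."""
    rng = random.Random(seed)
    grid = [k / (n_grid - 1) for k in range(n_grid)]
    curves = []
    for _ in range(n_subjects):
        # Subject-specific warp (the "internal clock"), shared by all components.
        subject_gamma = math.exp(rng.gauss(0.0, 0.2))
        subject_curves = []
        for gamma_j in component_gammas:
            # Random positive amplitude for this subject and component.
            amplitude = math.exp(rng.gauss(0.0, 0.1))
            # Assumed separable warp: component-level warp applied after the subject-level warp.
            values = [
                amplitude * latent(power_warp(power_warp(t, subject_gamma), gamma_j))
                for t in grid
            ]
            subject_curves.append(values)
        curves.append(subject_curves)
    return grid, curves


if __name__ == "__main__":
    grid, curves = simulate()
    for j, values in enumerate(curves[0]):
        peak_time = grid[values.index(max(values))]
        print(f"subject 0, component {j}: peak at t = {peak_time:.2f}")
```

Under these assumptions every component of a subject shares the same shape, while the peak location shifts systematically across components (through ψ_j) and varies from subject to subject (through h_i). This is the kind of mutual time warping that the model is meant to capture.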