Functional data contains two components: shape (or amplitude) and phase. This paper concerns a branch of functional data analysis (FDA), namely Shape-Based FDA, that isolates and analyzes the shapes of functions. Specifically, it studies Scalar-on-Shape (ScoSh) regression models, which incorporate the shapes of predictor functions and discard their phases. This sets ScoSh models apart from traditional Scalar-on-Function (ScoF) regression models, which incorporate full predictor functions. ScoSh is motivated by object data analysis, {\it e.g.}, for neuro-anatomical objects, where object morphologies are relevant and their parameterizations are arbitrary. ScoSh also differs from methods that arbitrarily pre-register the data and then use the registered data in subsequent analysis. In contrast, ScoSh models perform registration during regression, using the (non-parametric) Fisher-Rao inner product and nonlinear index functions to capture complex predictor-response relationships. This formulation results in the novel concepts of {\it regression phase} and {\it regression mean} of functions. Regression phases are time-warpings of predictor functions that minimize prediction errors, and regression means are optimal regression coefficients. We demonstrate practical applications of the ScoSh model using extensive simulated and real-data examples, including the prediction of COVID outcomes with daily rate curves as predictors.