Probability density functions form a specific class of functional data objects whose intrinsic properties, scale invariance and relative scale, are characterized by the unit integral constraint. The Bayes spaces methodology respects this specific nature, and the centred log-ratio transformation allows such functional data to be processed in the standard Lebesgue space of square-integrable functions. Because data representing densities are frequently observed in discrete form, attention has focused on their spline representation. The crucial step in the approximation is therefore to construct a spline basis that reflects these specific properties. Since the centred log-ratio transformation maps densities into a subspace of functions satisfying a zero integral constraint, the standard $B$-spline basis is no longer suitable. Recently, a new spline basis incorporating this zero integral property, called $Z\!B$-splines, was developed. However, this basis is not orthogonal, a property that is beneficial from both a computational and an application point of view. In this paper, we describe an efficient method for constructing an orthogonal $Z\!B$-spline basis, called $Z\!B$-splinets. The main advantages of the $Z\!B$-splinet approach are computational efficiency and locality of the basis supports, which is desirable for data interpretability, e.g. in the context of functional principal component analysis. The proposed approach is demonstrated on an empirical demographic dataset.
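To make the zero integral constraint concrete, the following is a minimal numerical sketch, not the implementation used in the paper. It assumes a density discretized on a grid over a bounded interval $[a, b]$ with a uniform reference measure, for which the centred log-ratio transformation is commonly written as $\mathrm{clr}(f)(t) = \ln f(t) - \frac{1}{b-a}\int_a^b \ln f(s)\,\mathrm{d}s$. The function names (`trapezoid`, `clr`, `clr_inverse`) and the example density are hypothetical and chosen only for illustration.

```python
import numpy as np


def trapezoid(y, x):
    """Integrate sampled values y over the grid x with the trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)


def clr(density, grid):
    """Centred log-ratio transform of a strictly positive sampled density.

    Assumes a uniform reference measure on [grid[0], grid[-1]].
    """
    log_f = np.log(density)
    length = grid[-1] - grid[0]
    # Subtract the mean of log f so that the result integrates to zero.
    return log_f - trapezoid(log_f, grid) / length


def clr_inverse(z, grid):
    """Map a zero-integral function back to a density with unit integral."""
    g = np.exp(z)
    return g / trapezoid(g, grid)


# Illustrative density on [0, 1]: a bell-shaped curve normalized numerically.
grid = np.linspace(0.0, 1.0, 201)
raw = np.exp(-0.5 * ((grid - 0.4) / 0.15) ** 2)
density = raw / trapezoid(raw, grid)

z = clr(density, grid)
zero_integral = trapezoid(z, grid)   # zero up to rounding error
recovered = clr_inverse(z, grid)     # matches the original density
scaled = clr(5.0 * density, grid)    # rescaling the density leaves clr unchanged
```

Under these assumptions, `zero_integral` vanishes up to rounding error, which is why a basis for clr-transformed densities has to satisfy the same constraint, and `scaled` coincides with `z`, which illustrates scale invariance: a density and any positive multiple of it carry the same relative information.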