We propose a new auto-regressive model for the statistical analysis of multivariate distributional time series. The data of interest consist of several series of probability measures, each supported on a bounded interval of the real line and indexed by distinct time instants. The probability measures are modelled as random objects in the Wasserstein space. We formulate the auto-regressive model in the tangent space at the Lebesgue measure, after first centering all the raw measures so that their Fr\'echet means become the Lebesgue measure. Using the theory of iterated random function systems, we provide results on the existence, uniqueness and stationarity of the solution of such a model. We also propose a consistent estimator for the model coefficients. In addition to the analysis of simulated data, the proposed model is illustrated with two real data sets: observations of age distributions in different countries and of a bike-sharing network in Paris. Finally, because of the positivity and boundedness constraints that we impose on the model coefficients, the proposed estimator, learned under these constraints, naturally has a sparse structure. This sparsity further allows the proposed model to be applied to learning a graph of temporal dependency from the multivariate distributional time series.
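As an illustration of the tangent-space representation mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes measures supported on [0, 1] and observed through samples, and it relies on the standard fact that, for one-dimensional measures, the logarithmic map at the Lebesgue measure sends a measure with quantile function Q to the function p ↦ Q(p) − p. The function names, grid size and sample distributions are hypothetical choices made for the example, and the centering step described in the abstract is not reproduced here.

```python
import numpy as np


def quantile_function(samples, grid):
    """Empirical quantile function of `samples`, evaluated at levels `grid` in [0, 1]."""
    return np.quantile(np.asarray(samples, dtype=float), grid)


def log_map_at_lebesgue(samples, grid):
    """Tangent vector at the Lebesgue measure on [0, 1]: Q(p) - p.

    Assumes the underlying measure is supported on [0, 1].
    """
    return quantile_function(samples, grid) - grid


def wasserstein_to_lebesgue(tangent_vector):
    """Approximate 2-Wasserstein distance to the Lebesgue measure.

    Uses the L2 norm of the tangent vector, approximated by a grid average.
    """
    return float(np.sqrt(np.mean(tangent_vector ** 2)))


# Hypothetical example data: one nearly uniform sample, one Beta(2, 5) sample.
grid = np.linspace(0.0, 1.0, 101)
rng = np.random.default_rng(0)
uniform_samples = rng.uniform(0.0, 1.0, size=5000)
beta_samples = rng.beta(2.0, 5.0, size=5000)

v_uniform = log_map_at_lebesgue(uniform_samples, grid)
v_beta = log_map_at_lebesgue(beta_samples, grid)

print("distance of uniform sample to Lebesgue:", wasserstein_to_lebesgue(v_uniform))
print("distance of Beta(2, 5) sample to Lebesgue:", wasserstein_to_lebesgue(v_beta))
```

In this representation, a measure close to the Lebesgue measure has a tangent vector close to zero, which is why centering the data so that the Fr\'echet mean is the Lebesgue measure places the series around the origin of the tangent space.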