Rough path theory provides the notion of the signature, a graded family of tensors that characterises, up to a negligible equivalence class, an ordered stream of vector-valued data. In the last few years, use of the signature has gained traction in time-series analysis, machine learning, deep learning and, more recently, kernel methods. In this article, we lay down the theoretical foundations for a connection between signature asymptotics, the theory of empirical processes, and Wasserstein distances, making the tools of the latter two available for the study of the former. Our main contribution is to show that the Hambly-Lyons limit can be reinterpreted as a statement about the asymptotic behaviour of Wasserstein distances between two independent empirical measures of samples from the same underlying distribution. In the setting studied here, these samples are drawn from a probability distribution determined by geometrical properties of the underlying path. The general question of rates of convergence for these objects has been studied in depth in the recent monograph of Bobkov and Ledoux. By using these results, we generalise the original result of Hambly and Lyons from $C^3$ curves to a broad class of $C^2$ ones. We conclude by providing an explicit way to compute the limit in terms of a second-order differential equation.
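
To make the central object concrete, the following is a minimal illustrative sketch, not the construction used in the article. It computes the $p$-Wasserstein distance between two independent empirical measures of equal size on the real line, using the standard fact that in one dimension the optimal coupling matches order statistics. The function name `empirical_wasserstein_1d`, the choice of the uniform distribution on $[0,1]$, and the rescaling by $n$ are assumptions made purely for illustration; the distribution arising from the geometry of the path is not reproduced here.

```python
import numpy as np


def empirical_wasserstein_1d(x, y, p=2):
    """W_p distance between two empirical measures on the real line.

    Hypothetical helper for illustration. Both samples must have the same
    size n, in which case the optimal coupling pairs the sorted samples:
    W_p^p = (1/n) * sum_i |x_(i) - y_(i)|^p.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("both samples must have the same size")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))


def rescaled_w2_squared(n, rng):
    """Draw two independent uniform samples of size n and return n * W_2^2."""
    x = rng.uniform(size=n)
    y = rng.uniform(size=n)
    return n * empirical_wasserstein_1d(x, y, p=2) ** 2


if __name__ == "__main__":
    # Assumed toy distribution: uniform on [0, 1].
    rng = np.random.default_rng(0)
    for n in (100, 1_000, 10_000):
        print(n, rescaled_w2_squared(n, rng))
```

Under these assumptions the printed values fluctuate from run to run but remain of order one as $n$ grows, which is the kind of asymptotic behaviour of Wasserstein distances between empirical measures that the abstract refers to.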