We study nonparametric regression and classification for path-valued data. We introduce a functional Nadaraya-Watson estimator that combines the signature transform from rough path theory with local kernel regression. The signature transform encodes sequential data through iterated integrals, so that paths can be compared directly in a natural metric space. Our approach uses signature-induced distances within the classical kernel regression framework, which keeps computation efficient and avoids the scalability bottlenecks of large-scale kernel matrix operations. We establish finite-sample convergence bounds showing that signature-based distances have favorable statistical properties compared with traditional metrics in infinite-dimensional settings. We also propose robust signature variants that are stable against outliers, which improves practical performance. Applications to synthetic and real-world data, including stochastic differential equation learning and time series classification, demonstrate competitive accuracy together with significant computational advantages over existing methods.
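To make the construction concrete, the following is a minimal sketch of a Nadaraya-Watson estimator driven by a signature-induced distance. It is an illustration under stated assumptions rather than the estimator analyzed in the paper: the signature is truncated at depth 2, paths are treated as piecewise linear, the distance is the Euclidean norm between truncated signatures, and the kernel is Gaussian. The names `signature_depth2` and `nw_predict` are hypothetical, and the robust signature variants are not covered.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path of shape (T, d).

    Assumption: truncation at depth 2 is chosen only to keep the sketch short.
    """
    path = np.asarray(path, dtype=float)
    inc = np.diff(path, axis=0)          # segment increments, shape (T-1, d)
    start = path[:-1] - path[0]          # displacement from X_0 at each segment start
    level1 = inc.sum(axis=0)             # first level: total increment
    # Second level: iterated integrals of (X^i - X^i_0) dX^j, exact for linear segments.
    level2 = start.T @ inc + 0.5 * (inc.T @ inc)
    return np.concatenate([level1, level2.ravel()])


def nw_predict(train_paths, train_y, query_path, bandwidth=1.0):
    """Nadaraya-Watson prediction with a Gaussian kernel on signature distances.

    Assumption: Euclidean distance between truncated signatures stands in for
    the signature-induced distance.
    """
    feats = np.stack([signature_depth2(p) for p in train_paths])
    y = np.asarray(train_y, dtype=float)
    dist = np.linalg.norm(feats - signature_depth2(query_path), axis=1)
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
    total = weights.sum()
    if total == 0.0:                     # every weight underflowed: fall back to the mean
        return float(y.mean())
    return float(weights @ y / total)
```

Prediction needs only the distances from the query to each training path, so no n-by-n kernel matrix is formed or inverted. Because each path is reduced to a fixed-length feature vector, training paths may have different numbers of sample points as long as they share the same dimension d.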