Signature transforms are iterated path integrals of continuous-time and discrete-time series data, and their universal nonlinearity linearizes the problem of feature selection. This paper revisits the consistency of Lasso regression for the signature transform, both theoretically and numerically. Our study shows that, for processes and time series that are closer to Brownian motion or a random walk with weaker inter-dimensional correlations, Lasso regression is more consistent for signatures defined by Itô integrals; for mean-reverting processes and time series, Lasso regression is more consistent for signatures defined by Stratonovich integrals. Our findings highlight the importance of choosing appropriate definitions of signatures and stochastic models in statistical inference and machine learning.
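To make the distinction between the two signature definitions concrete, the following is a minimal sketch of the second-level iterated integrals of a discretely sampled path. It rests on assumptions not stated in the abstract: the Itô version is approximated with the left endpoint of each step, the Stratonovich version with the midpoint, and the helper names `level2_signature` and `random_walk` are hypothetical. The Lasso regression step itself is not shown.

```python
import random


def level2_signature(path, rule="ito"):
    """Second-level iterated integrals of a discretely sampled path.

    path: list of d-dimensional points (lists of floats) in time order.
    rule: "ito" evaluates the integrand at the left endpoint of each step;
          "stratonovich" evaluates it at the midpoint.
    Returns a d x d nested list S, where S[i][j] approximates the
    integral of (X^i - X^i_0) dX^j over the sampled interval.
    """
    if rule not in ("ito", "stratonovich"):
        raise ValueError("rule must be 'ito' or 'stratonovich'")
    d = len(path[0])
    start = path[0]
    sig = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        for i in range(d):
            if rule == "ito":
                integrand = prev[i] - start[i]
            else:
                integrand = 0.5 * (prev[i] + curr[i]) - start[i]
            for j in range(d):
                sig[i][j] += integrand * (curr[j] - prev[j])
    return sig


def random_walk(n_steps, d, seed=0):
    """Gaussian random walk on [0, 1] with independent coordinates (illustrative data)."""
    rng = random.Random(seed)
    scale = (1.0 / n_steps) ** 0.5
    path = [[0.0] * d]
    for _ in range(n_steps):
        path.append([x + rng.gauss(0.0, scale) for x in path[-1]])
    return path


path = random_walk(n_steps=1000, d=2, seed=0)
ito = level2_signature(path, "ito")
strat = level2_signature(path, "stratonovich")
```

In this discretization the two definitions differ by exactly half the realized quadratic covariation of the path, which is why the choice between them changes the features fed to the regression.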