Path signatures have been proposed as a powerful representation of paths that efficiently captures their analytic and geometric characteristics and has useful algebraic properties, including fast concatenation of paths through tensor products. Signatures have recently been widely adopted in machine learning for time series analysis. In this work, we establish connections between the value functions typically used in optimal control and properties of path signatures. These connections motivate a novel control framework based on signature transforms that efficiently generalizes the Bellman equation to the space of trajectories. We analyze the properties and advantages of this framework, termed signature control. In particular, we demonstrate that (i) it naturally handles varying or adaptive time steps; (ii) it propagates higher-level information more efficiently than value function updates; and (iii) it is robust to dynamical system misspecification over long rollouts. As a specific case of our framework, we devise a model predictive control method for path tracking. This method generalizes integral control and is suitable for problems with unknown disturbances. The proposed algorithms are tested in simulation with differentiable physics models on typical control and robotics tasks, including a point mass, curve following for an ant model, and a robotic manipulator.
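To make the concatenation property concrete, the following is a minimal sketch of a signature truncated at depth 2 for a piecewise-linear path, built segment by segment with the tensor-product rule (Chen's identity). It is an illustration under stated assumptions, not the implementation used in this work: the function names `segment_signature`, `chen_product`, and `path_signature` are hypothetical, and the truncation depth and the piecewise-linear interpolation are choices made for the example.

```python
# Illustrative sketch only: depth-2 truncated signature of a piecewise-linear
# path in pure Python. All names here are hypothetical, not from the paper.

def segment_signature(delta):
    """Depth-2 signature of one straight segment with increment `delta`."""
    d = len(delta)
    level1 = [float(x) for x in delta]
    # For a straight line, the second level is (delta ⊗ delta) / 2.
    level2 = [[delta[i] * delta[j] / 2.0 for j in range(d)] for i in range(d)]
    return level1, level2


def chen_product(sig_a, sig_b):
    """Signature of path A followed by path B (Chen's identity, depth 2)."""
    a1, a2 = sig_a
    b1, b2 = sig_b
    d = len(a1)
    level1 = [a1[i] + b1[i] for i in range(d)]
    # Cross term a1 ⊗ b1 records that A is traversed before B.
    level2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(d)]
              for i in range(d)]
    return level1, level2


def path_signature(points):
    """Depth-2 signature of the piecewise-linear path through `points`."""
    d = len(points[0])
    # Signature of the trivial path: zero at levels 1 and 2.
    sig = ([0.0] * d, [[0.0] * d for _ in range(d)])
    for p, q in zip(points, points[1:]):
        delta = [qk - pk for pk, qk in zip(p, q)]
        sig = chen_product(sig, segment_signature(delta))
    return sig


if __name__ == "__main__":
    coarse = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
    # Same curve sampled with uneven steps along each edge.
    fine = [(0.0, 0.0), (0.25, 0.0), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
    print(path_signature(coarse))
    print(path_signature(fine))
    # Concatenating the signatures of two sub-paths gives the whole path.
    first, second = coarse[:2], coarse[1:]
    print(chen_product(path_signature(first), path_signature(second)))
```

The two sampled versions of the same curve yield the same signature because the signature does not depend on how the path is parameterized, which is one way to see why uneven or adaptive time steps pose no difficulty for a signature-based representation.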