Accurate disturbance estimation is essential for safe robot operation. The recently proposed neural moving horizon estimation (NeuroMHE), which uses a portable neural network to model the MHE's weightings, has shown promise in pushing the boundaries of both accuracy and efficiency. Currently, NeuroMHE is trained through gradient descent, with its gradient computed recursively using a Kalman filter. This paper proposes a trust-region policy optimization method for training NeuroMHE. We achieve this by providing the second-order derivatives of MHE, referred to as the MHE Hessian. Remarkably, we show that much of the computation already used to obtain the gradient, especially the Kalman filter, can be efficiently reused to compute the MHE Hessian, so the computational complexity is linear in the MHE horizon. As a case study, we evaluate the proposed trust-region NeuroMHE on real quadrotor flight data for disturbance estimation. Our approach trains in under 5 minutes using only 100 data points. It outperforms a state-of-the-art neural estimator by up to 68.1% in force estimation accuracy while using only 1.4% of its network parameters. Furthermore, our method is more robust to network initialization than its gradient descent counterpart.
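To illustrate how a trust-region update differs from a plain gradient descent step, the following is a minimal sketch of a generic trust-region iteration that uses both the gradient and the Hessian of a loss. It is not the NeuroMHE training procedure from the paper: the toy quadratic loss, the Cauchy-point step, the radius-update thresholds, and all function names (`cauchy_point`, `trust_region_minimize`) are assumptions chosen for illustration only.

```python
import numpy as np

def cauchy_point(g, H, radius):
    """Minimize the quadratic model along -g, restricted to the trust region."""
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return np.zeros_like(g)
    gHg = g @ H @ g
    # Fraction of the radius to travel along the steepest-descent direction.
    tau = 1.0 if gHg <= 0.0 else min(1.0, g_norm**3 / (radius * gHg))
    return -tau * radius / g_norm * g

def trust_region_minimize(loss, grad, hess, x0, radius=1.0, max_radius=10.0,
                          eta=0.1, tol=1e-6, max_iter=500):
    """Generic trust-region loop (illustrative thresholds, not from the paper)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        p = cauchy_point(g, H, radius)
        # Decrease predicted by the quadratic model vs. decrease actually achieved.
        predicted = -(g @ p + 0.5 * p @ H @ p)
        actual = loss(x) - loss(x + p)
        rho = actual / predicted if predicted > 0.0 else 0.0
        if rho < 0.25:
            radius *= 0.25                      # model was poor: shrink the region
        elif rho > 0.75 and np.isclose(np.linalg.norm(p), radius):
            radius = min(2.0 * radius, max_radius)  # model was good at the boundary
        if rho > eta:
            x = x + p                           # accept the step
    return x

# Toy problem (assumed for illustration): loss(x) = 0.5 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
loss = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
hess = lambda x: A

x_star = trust_region_minimize(loss, grad, hess, x0=np.zeros(2))
print(x_star)
```

The ratio `rho` between the actual and the model-predicted decrease is what lets the method adapt its step size automatically, which is one common explanation for why trust-region methods tend to be less sensitive to initialization than fixed-step gradient descent.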