In this paper, we study the problem of online tracking in linear control systems, where the objective is to follow a moving target. Unlike in classical tracking control, the target is unknown and non-stationary, and its state is revealed sequentially, so the problem fits the framework of online non-stochastic control. We consider the case of quadratic costs and propose a new algorithm, called predictive linear online tracking (PLOT). The algorithm uses recursive least squares with exponential forgetting to learn a time-varying dynamic model of the target. The learned model is then used in the optimal policy within a receding horizon control framework. We show that the dynamic regret of PLOT scales as $\mathcal{O}(\sqrt{TV_T})$, where $V_T$ is the total variation of the target dynamics and $T$ is the time horizon. Unlike prior work, our theoretical results hold for non-stationary targets. We implement PLOT on a real quadrotor and provide open-source software, showcasing one of the first successful applications of online control methods on real hardware.
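To make the learning step concrete, the following is a minimal sketch of recursive least squares with exponential forgetting, followed by a multi-step target prediction of the kind a receding horizon controller could consume. It is not the authors' implementation. It assumes, purely for illustration, that the target state obeys a linear model $r_{t+1} = S_t r_t$; the names `rls_forgetting_step`, `predict_targets`, `run_demo`, the forgetting factor `gamma`, and the initial covariance scale are all hypothetical choices.

```python
import numpy as np


def rls_forgetting_step(S_hat, P, r_prev, r_next, gamma=0.95):
    """One RLS update with forgetting factor gamma (0 < gamma <= 1).

    Assumed model (illustrative only): r_next ~= S_hat @ r_prev.
    P is the symmetric covariance-like matrix shared by all rows of S_hat.
    """
    Pr = P @ r_prev
    gain = Pr / (gamma + r_prev @ Pr)       # RLS gain vector
    err = r_next - S_hat @ r_prev           # one-step prediction error
    S_hat = S_hat + np.outer(err, gain)     # correct the model estimate
    P = (P - np.outer(gain, Pr)) / gamma    # discount old information
    return S_hat, P


def predict_targets(S_hat, r, horizon):
    """Roll the learned model forward to predict the next `horizon` target states."""
    preds = []
    for _ in range(horizon):
        r = S_hat @ r
        preds.append(r)
    return np.array(preds)


def run_demo(steps=200, gamma=0.95):
    """Learn a constant rotation from noise-free observations (hypothetical example)."""
    theta = 0.1
    S_true = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
    S_hat = np.zeros((2, 2))                # uninformed initial model
    P = 1e3 * np.eye(2)                     # large initial covariance (assumption)
    r = np.array([1.0, 0.0])
    for _ in range(steps):
        r_next = S_true @ r                 # target state revealed sequentially
        S_hat, P = rls_forgetting_step(S_hat, P, r, r_next, gamma)
        r = r_next
    return S_hat, S_true
```

A smaller `gamma` discounts old observations faster, which helps the estimate follow changing target dynamics at the cost of more sensitivity to noise. In a receding horizon loop, `predict_targets` would be called at every step with the latest estimate, and only the first control input computed from those predictions would be applied.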