We study feedback controller synthesis for reach-avoid control of discrete-time, linear time-invariant (LTI) systems with Gaussian process and measurement noise. The problem is to compute a controller such that, with at least some required probability, the system reaches a desired goal state in finite time while avoiding unsafe states. Due to stochasticity and nonconvexity, this problem does not admit exact algorithmic or closed-form solutions in general. Our key contribution is a correct-by-construction controller synthesis scheme based on a finite-state abstraction of a Gaussian belief over the unmeasured state, obtained using a Kalman filter. We formalize this abstraction as a Markov decision process (MDP). To be robust against numerical imprecision in approximating transition probabilities, we use MDPs with intervals of transition probabilities. By construction, any policy on the abstraction can be refined into a piecewise linear feedback controller for the LTI system. We prove that the closed-loop LTI system under this controller satisfies the reach-avoid problem with at least the required probability. The numerical experiments show that our method is able to solve reach-avoid problems for systems with up to 6D state spaces, and with control input constraints that cannot be handled by methods such as the rapidly-exploring random belief trees (RRBT).
翻译:我们研究离散时间、线性时不变(LTI)系统在存在高斯过程噪声和测量噪声条件下的可达-避碰控制的反馈控制器综合问题。该问题旨在计算一个控制器,使得系统在有限时间内以不低于指定概率到达期望目标状态,同时避免进入不安全状态。由于随机性和非凸性,该问题通常无法求得精确算法解或闭式解。我们的核心贡献在于提出一种基于有限状态抽象的正确性保证控制器综合方案:利用卡尔曼滤波器对未测量状态的高斯置信度进行抽象,并将其形式化为马尔可夫决策过程(MDP)。为应对转移概率近似计算中的数值误差,我们采用区间转移概率的MDP模型。通过构造,该抽象上的任意策略均可精化为LTI系统的分段线性反馈控制器。我们证明,在该控制器作用下,闭环LTI系统能以不低于所需概率满足可达-避碰要求。数值实验表明,我们的方法可解决状态空间维度高达6维、且存在控制输入约束的系统可达-避碰问题,而此类约束无法通过快速探索随机置信树(RRBT)等方法处理。