This work proposes a decision-making framework for partially observable systems in continuous time with discrete state and action spaces. Because optimal decision-making becomes intractable for large state spaces, we employ approximation methods for both the filtering and the control problems that scale well with the number of states. Specifically, we approximate the high-dimensional filtering distribution by projecting it onto a parametric family of distributions and integrate this approximation into a control heuristic based on the fully observable system, yielding a scalable policy. We demonstrate the effectiveness of our approach on several partially observed systems, including queueing systems and chemical reaction networks.
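The two approximation steps named in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the parametric family is a truncated Poisson fitted by matching the mean, the control heuristic is a QMDP-style rule that averages fully observed Q-values under the belief, and the names `truncated_poisson`, `project_to_poisson`, `qmdp_action`, and the `q_table` values are hypothetical. The continuous-time evolution of the filter is omitted; the sketch only shows a single projection and a single action choice.

```python
import math


def truncated_poisson(mean, n_states):
    """Poisson pmf restricted to states 0..n_states-1 and renormalised."""
    weights = [math.exp(-mean) * mean ** k / math.factorial(k)
               for k in range(n_states)]
    total = sum(weights)
    return [w / total for w in weights]


def project_to_poisson(belief):
    """Assumed projection: keep only the belief mean (moment matching).

    The truncation means the projected mean is only approximately
    equal to the original one.
    """
    mean = sum(k * p for k, p in enumerate(belief))
    return truncated_poisson(mean, len(belief))


def qmdp_action(belief, q_table):
    """Assumed heuristic: maximise the belief-averaged fully observed Q-values."""
    n_actions = len(q_table[0])
    scores = [sum(p * q_table[x][a] for x, p in enumerate(belief))
              for a in range(n_actions)]
    return max(range(n_actions), key=scores.__getitem__)


# Hypothetical Q-values for a 5-state queue with two service rates
# (action 0 = slow and cheap, action 1 = fast and costly).
q_table = [[0.0, -1.0], [-1.0, -1.5], [-2.0, -2.0], [-3.0, -2.5], [-4.0, -3.0]]

belief = [0.05, 0.10, 0.20, 0.30, 0.35]   # exact filter output (made up)
approx = project_to_poisson(belief)       # one parameter instead of five
action = qmdp_action(approx, q_table)
```

The point of the projection is that the belief is then described by a fixed number of parameters regardless of how many states the system has, while the action rule reuses quantities computed for the fully observable problem.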