This work proposes a decision-making framework for partially observable systems in continuous time with discrete state and action spaces. Because optimal decision-making becomes intractable for large state spaces, we employ approximation methods for both the filtering and the control problems that scale well with the number of states. Specifically, we approximate the high-dimensional filtering distribution by projecting it onto a parametric family of distributions, and we integrate this approximate filter into a control heuristic based on the fully observable system to obtain a scalable policy. We demonstrate the effectiveness of our approach on several partially observed systems, including queueing systems and chemical reaction networks.
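As a rough illustration of the two ingredients named above, the following sketch projects a discrete belief onto a parametric family and then selects an action from fully observable action values. It is not the method of this work: the choice of a Poisson family, the moment-matching projection, the belief-averaged action rule, and all names (`project_to_poisson`, `poisson_pmf`, `belief_averaged_action`, `q_values`) are assumptions made only for this example.

```python
import numpy as np


def project_to_poisson(belief):
    """Project a belief over states 0..n-1 onto the Poisson family.

    Assumption: the projection is done by matching the mean, so the
    whole belief is summarized by a single rate parameter.
    """
    states = np.arange(len(belief))
    return float(states @ belief)


def poisson_pmf(rate, n_states):
    """Poisson pmf truncated to 0..n_states-1 and renormalized."""
    rate = max(rate, 1e-12)  # avoid log(0) for a belief concentrated on state 0
    states = np.arange(n_states)
    # log(k!) computed as a running sum of log(i) for i = 1..k
    log_factorial = np.cumsum(np.log(np.maximum(states, 1)))
    log_p = states * np.log(rate) - rate - log_factorial
    p = np.exp(log_p - log_p.max())
    return p / p.sum()


def belief_averaged_action(belief, q_values):
    """Pick the action with the highest belief-weighted action value.

    q_values[s, a] is assumed to come from solving the fully observable
    control problem; how it is obtained is outside this sketch.
    """
    return int(np.argmax(belief @ q_values))


# Hypothetical example: 5 states, 2 actions.
n_states = 5
q_values = np.column_stack([
    -np.arange(n_states, dtype=float),  # action 0: cost grows with the state
    np.full(n_states, -1.5),            # action 1: constant cost
])
belief = np.full(n_states, 1.0 / n_states)
rate = project_to_poisson(belief)
approx_belief = poisson_pmf(rate, n_states)
action = belief_averaged_action(approx_belief, q_values)
```

The point of the projection is that the filter only has to track the family's parameters (here one rate) rather than one probability per state, which is what would make such a scheme scale with the number of states.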