Designing efficient and rigorous numerical methods for sequential decision-making under uncertainty is a difficult problem that arises in many application frameworks. In this paper, we focus on the numerical solution of a subclass of impulse control problems for piecewise deterministic Markov processes (PDMPs) when the jump times are hidden. We first state the problem as a partially observed Markov decision process (POMDP) on a continuous state space, with controlled transition kernels corresponding to specific skeleton chains of the PDMP. We then build a numerically tractable approximation of the POMDP by tailor-made discretizations of the state spaces. The main difficulty in evaluating the discretization error comes from the possible random or boundary jumps of the PDMP between consecutive epochs of the POMDP, which require special care. Finally, we discuss in detail the practical construction of discretization grids and illustrate our method on simulations.