We introduce stochastic decision Petri nets (SDPNs), which are a form of stochastic Petri nets equipped with rewards and a control mechanism via the deactivation of controllable transitions. Such nets can be translated into Markov decision processes (MDPs), potentially leading to a combinatorial explosion in the number of states due to concurrency. Hence we restrict ourselves to instances where nets are either safe, free-choice and acyclic nets (SAFC nets) or even occurrence nets and policies are defined by a constant deactivation pattern. We obtain complexity-theoretic results for such cases via a close connection to Bayesian networks, in particular we show that for SAFC nets the question whether there is a policy guaranteeing a reward above a certain threshold is $\mathsf{NP}^\mathsf{PP}$-complete. We also introduce a partial-order procedure which uses an SMT solver to address this problem.
翻译:我们提出随机决策Petri网(SDPNs),这是一种配备奖励并通过禁用可控变迁实现控制机制的随机Petri网形式。此类网可转化为马尔可夫决策过程(MDPs),但并发性可能导致状态数量组合爆炸。因此,我们将研究范围限定于安全、自由选择且无环的网(SAFC网)或出现网,并采用恒定禁用模式定义策略。通过与贝叶斯网络的紧密联系,我们获得了此类情形的复杂度理论结果,特别地证明了对于SAFC网,判定是否存在保证奖励超过给定阈值的策略是$\mathsf{NP}^\mathsf{PP}$-完全问题。我们还引入了一种基于SMT求解器的偏序过程来解决该问题。