We study the complexity of reductions for weighted reachability in parametric Markov decision processes. That is, we say a state p is never worse than q if for all valuations of the polynomial indeterminates it is the case that the maximal expected weight that can be reached from p is greater than the same value from q. In terms of computational complexity, we establish that determining whether p is never worse than q is coETR-complete. On the positive side, we give a polynomial-time algorithm to compute the equivalence classes of the order we study for Markov chains. Additionally, we describe and implement two inference rules to under-approximate the never-worse relation and empirically show that it can be used as an efficient preprocessing step for the analysis of large Markov decision processes.
翻译:我们研究了参数化马尔可夫决策过程中加权可达性约简的复杂度。具体地,若对于所有多项式不定元的赋值,从状态p可达的最大期望权重大于从状态q可达的同一值,则称状态p绝不劣于q。在计算复杂度方面,我们证明了判断p是否绝不劣于q是coETR完备的。在正面结果中,我们给出了一个多项式时间算法,用于计算马尔可夫链中我们研究序关系的等价类。此外,我们描述并实现了两条推理规则以对绝不劣关系进行下近似,并通过实验表明其可作为大规模马尔可夫决策过程分析的高效预处理步骤。