Discounted-Sum Automata with Multiple Discount Factors

Discounting the influence of future events is a key paradigm in economics, and it is widely used in computer-science models such as games, Markov decision processes (MDPs), reinforcement learning, and automata. While a single game or MDP may allow for several different discount factors, nondeterministic discounted-sum automata (NDAs) have only been studied with respect to a single discount factor. It is known that every class of NDAs with an integer as the discount factor has good computational properties: it is closed under determinization and under the algebraic operations min, max, addition, and subtraction, and there are algorithms for its basic decision problems, such as automata equivalence and containment. Extending the integer discount factor to an arbitrary rational number loses most of these good properties. We define and analyze nondeterministic discounted-sum automata in which each transition can have a different integral discount factor (integral NMDAs). We show that integral NMDAs with an arbitrary choice of discount factors are not closed under determinization or under algebraic operations, and that their containment problem is undecidable. We then define and analyze a restricted class of integral NMDAs, which we call tidy NMDAs, in which the choice of discount factors depends on the prefix of the word read so far. Among their special cases are NMDAs that correlate discount factors to actions (alphabet letters) or to the elapsed time. We show that for every function $\theta$ that defines the choice of discount factors, the class of $\theta$-NMDAs enjoys all of the above good properties of NDAs with a single integral discount factor, with the same complexity for the required decision problems. Tidy NMDAs are also as expressive as deterministic integral NMDAs with an arbitrary choice of discount factors.
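
To make the idea of per-transition discount factors concrete, the following is a minimal sketch and not the paper's construction. It assumes the usual discounted-sum convention, which the abstract does not spell out: a run with weights $w_0, w_1, \ldots$ and discount factors $\lambda_0, \lambda_1, \ldots$ has value $\sum_i w_i / (\lambda_0 \cdots \lambda_{i-1})$. The names `discounted_sum`, `THETA`, and `run_for_word` are hypothetical, and `THETA` illustrates only the special case in which the discount factor is tied to the alphabet letter.

```python
from fractions import Fraction

def discounted_sum(run):
    """Value of a finite run given as (weight, discount_factor) pairs.

    Assumed convention: the i-th weight is divided by the product of the
    discount factors of all earlier transitions.
    """
    total = Fraction(0)
    scale = Fraction(1)  # product of 1/factor over the transitions seen so far
    for weight, factor in run:
        total += scale * weight
        scale /= factor
    return total

# Hypothetical letter-based choice of discount factors (illustration only).
THETA = {"a": 2, "b": 3}

def run_for_word(word, weights):
    """Pair each letter's weight with the discount factor chosen for that letter."""
    return [(weights[ch], THETA[ch]) for ch in word]

# Word "aba" with weights a -> 1, b -> 4: 1 + 4/2 + 1/(2*3)
value = discounted_sum(run_for_word("aba", {"a": 1, "b": 4}))
single_factor = discounted_sum([(1, 2), (1, 2), (1, 2)])  # 1 + 1/2 + 1/4
```

Exact rationals (`Fraction`) are used so that values of different runs can be compared without floating-point error. A single-discount-factor NDA corresponds to the case where every transition carries the same factor, as in `single_factor`.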
