Decision analysis deals with modeling and enhancing decision processes. A principal challenge in improving behavior is obtaining a transparent description of existing behavior in the first place. In this paper, we develop an expressive, unifying perspective on inverse decision modeling: a framework for learning parameterized representations of sequential decision behavior. First, we formalize the forward problem (as a normative standard), subsuming common classes of control behavior. Second, we use this to formalize the inverse problem (as a descriptive model), generalizing existing work on imitation/reward learning -- while opening up a much broader class of research problems in behavior representation. Finally, we instantiate this approach with an example (inverse bounded rational control), illustrating how this structure enables learning (interpretable) representations of (bounded) rationality -- while naturally capturing intuitive notions of suboptimal actions, biased beliefs, and imperfect knowledge of environments.