When assisting human users in reinforcement learning (RL), we can represent users as RL agents and study key parameters, called \emph{user traits}, to inform intervention design. We study the relationship between user behaviors (policy classes) and user traits. Given an environment, we introduce an intuitive tool for studying the breakdown of "user types": broad sets of traits that result in the same behavior. We show that seemingly different real-world environments admit the same set of user types and formalize this observation as an equivalence relation defined on environments. By transferring intervention design between environments within the same equivalence class, we can help rapidly personalize interventions.