We anticipate that humans and AI systems will increasingly work together in what we refer to as hybrid teams. This collaboration is expected to grow as AI systems gain proficiency and their adoption becomes more widespread. However, AI systems are not error-free, so pairing them with humans in a hybrid team is a natural choice. We therefore consider methods for improving the performance of such teams. We refer to both the humans and the AI systems in a hybrid team as agents. To improve team performance beyond that of agents operating individually, we propose a manager that learns, through a standard Reinforcement Learning scheme, how best to delegate to one of the agents, over time, the responsibility for making a decision. We further guide the manager's learning so that it also minimizes the number of delegation changes made in response to undesirable team behavior. We assess the optimality of the manager's performance in several grid environments containing failure states, which terminate an episode and should be avoided. We run our experiments with teams of agents that have varying degrees of acceptable risk, expressed as proximity to a failure state, measure the manager's ability to make effective delegation decisions with respect to its own risk-based constraints, and compare these decisions to the optimal ones. Our results show that the manager can learn desirable delegations that yield team paths that are optimal or near-optimal with respect to both path length and number of delegations.
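To make the delegation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes tabular Q-learning as the "standard Reinforcement Learning scheme", a hypothetical 2x5 grid with one failure state, two hand-written agents (`fast_agent` and `safe_agent`, both invented here), and illustrative reward values (`SWITCH_COST`, the step cost, and the goal and failure rewards). The manager's own risk-based constraints are not modeled. The manager's state is the current cell plus the previously chosen delegate, and its action is which agent acts next.

```python
import random

# Hypothetical toy setup: every name and number below is an illustrative assumption.
ROWS, COLS = 2, 5
START, GOAL, HAZARD = (0, 0), (0, 4), (0, 2)
MOVES = {"right": (0, 1), "down": (1, 0), "up": (-1, 0)}
SWITCH_COST, MAX_STEPS = 0.5, 50

def in_bounds(cell):
    return 0 <= cell[0] < ROWS and 0 <= cell[1] < COLS

def shift(cell, move):
    dr, dc = MOVES[move]
    return (cell[0] + dr, cell[1] + dc)

def hazard_distance(cell):
    return abs(cell[0] - HAZARD[0]) + abs(cell[1] - HAZARD[1])

def fast_agent(cell):
    # Risk-tolerant agent: heads right, then up, and ignores the failure state.
    return "right" if cell[1] < COLS - 1 else "up"

def safe_agent(cell):
    # Risk-averse agent: never enters a cell within distance 1 of the failure state.
    for move in ("right", "down", "up"):
        nxt = shift(cell, move)
        if in_bounds(nxt) and hazard_distance(nxt) > 1:
            return move
    return None  # no acceptable move, so the agent stays put

AGENTS = [fast_agent, safe_agent]

def step(cell, prev, choice):
    """Let the delegated agent move; return (next_cell, reward, done)."""
    move = AGENTS[choice](cell)
    nxt = shift(cell, move) if move else cell
    if not in_bounds(nxt):
        nxt = cell
    reward = -1.0  # per-step cost favors short paths
    if prev is not None and choice != prev:
        reward -= SWITCH_COST  # penalize changing the delegate
    if nxt == HAZARD:
        return nxt, reward - 20.0, True
    if nxt == GOAL:
        return nxt, reward + 10.0, True
    return nxt, reward, False

def train(episodes=3000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning over manager states (cell, previous delegate)."""
    rng = random.Random(seed)
    q = {}  # state -> [value of delegating to agent 0, value for agent 1]
    for _ in range(episodes):
        cell, prev = START, None
        for _ in range(MAX_STEPS):
            values = q.setdefault((cell, prev), [0.0, 0.0])
            if rng.random() < epsilon:
                choice = rng.randrange(len(AGENTS))
            else:
                choice = values.index(max(values))
            nxt, reward, done = step(cell, prev, choice)
            future = 0.0 if done else max(q.setdefault((nxt, choice), [0.0, 0.0]))
            values[choice] += alpha * (reward + gamma * future - values[choice])
            if done:
                break
            cell, prev = nxt, choice
    return q

def rollout(q):
    """Follow the greedy delegation policy; return path, delegations, switches."""
    cell, prev, path, delegations = START, None, [START], []
    for _ in range(MAX_STEPS):
        values = q.get((cell, prev), [0.0, 0.0])
        choice = values.index(max(values))
        cell, _, done = step(cell, prev, choice)
        path.append(cell)
        delegations.append(choice)
        prev = choice
        if done:
            break
    switches = sum(a != b for a, b in zip(delegations, delegations[1:]))
    return path, delegations, switches

if __name__ == "__main__":
    path, delegations, switches = rollout(train())
    print(path, delegations, switches)
```

In this toy grid neither agent succeeds alone: `fast_agent` walks into the failure state and `safe_agent` stalls next to it. The learned manager is therefore expected to hand control from the cautious agent to the risk-tolerant one exactly once, which gives the shortest safe path with the fewest delegation changes. Including the previous delegate in the state is what lets the switching penalty influence the learned policy.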