Active Probing and Influencing Human Behaviors Via Autonomous Agents

Autonomous agents (robots) face substantial challenges when interacting with heterogeneous human agents in close proximity. One such challenge is that the autonomous agent lacks an accurate model tailored to the specific human it is interacting with, which can result in inefficient human-robot interaction and suboptimal system dynamics. Developing an online method that enables the autonomous agent to learn about the human model is therefore an ongoing research goal. Existing approaches position the robot as a passive learner that observes the physical states and the associated human responses. This passive design, however, only allows the robot to obtain information that the human chooses to exhibit, which sometimes does not capture the human's full intention. In this work, we present an online optimization-based probing procedure with which the autonomous agent actively clarifies its belief about the human model. By optimizing an information radius, the autonomous agent chooses the action that most challenges its current conviction. This procedure allows the autonomous agent to actively probe the human agents so that they reveal information that was previously unavailable to it. With the gathered information, the autonomous agent can interactively influence the human agent toward designated objectives. Our main contributions are a coherent theoretical framework that unifies the probing and influence procedures, and two case studies in autonomous driving that show how active probing can create a better participant experience during influence, such as higher efficiency or fewer perturbations.
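To make the probing step concrete, the following is a minimal sketch rather than the formulation used in this work. It assumes that the information radius is the belief-weighted Jensen-Shannon divergence (Sibson's information radius) between the response distributions predicted by each candidate human model, and that the human responses are discrete. The function names, the two human types, the two robot actions, and all probabilities are hypothetical and chosen only for illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def information_radius(belief, response_dists):
    """Belief-weighted Jensen-Shannon divergence (assumed definition).

    belief:         shape (n_types,), current belief over human models
    response_dists: shape (n_types, n_responses), predicted human response
                    distribution under each model for one robot action
    """
    belief = np.asarray(belief, dtype=float)
    response_dists = np.asarray(response_dists, dtype=float)
    mixture = belief @ response_dists
    weighted = sum(b * entropy(d) for b, d in zip(belief, response_dists))
    return entropy(mixture) - weighted

def choose_probing_action(belief, response_model):
    """Pick the action whose predicted responses best separate the models."""
    scores = {a: information_radius(belief, d) for a, d in response_model.items()}
    return max(scores, key=scores.get), scores

def update_belief(belief, response_dists, observed):
    """Bayesian belief update after observing a response index."""
    posterior = np.asarray(belief, dtype=float) * np.asarray(response_dists)[:, observed]
    return posterior / posterior.sum()

# Hypothetical driving example: human types are (aggressive, cautious),
# responses are (accelerate, yield). All numbers are illustrative.
belief = np.array([0.5, 0.5])
response_model = {
    "keep_lane": np.array([[0.5, 0.5], [0.5, 0.5]]),  # both types react alike
    "nudge_in":  np.array([[0.9, 0.1], [0.1, 0.9]]),  # types react differently
}

best_action, scores = choose_probing_action(belief, response_model)
posterior = update_belief(belief, response_model[best_action], observed=1)
print(best_action, scores, posterior)
```

In this toy setup, staying in the lane yields an information radius of zero because both hypothesized human types respond identically, so only the nudging action can change the belief. A passive observer that never takes such an action would learn nothing about which type it faces, which is the limitation the probing procedure is meant to address.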
