Interaction and cooperation with humans are overarching aspirations of artificial intelligence (AI) research. Recent studies demonstrate that AI agents trained with deep reinforcement learning are capable of collaborating with humans. These studies primarily evaluate human compatibility through "objective" metrics such as task performance, obscuring potential variation in the levels of trust and subjective preference that different agents garner. To better understand the factors shaping subjective preferences in human-agent cooperation, we train deep reinforcement learning agents in Coins, a two-player social dilemma. We recruit $N = 501$ participants for a human-agent cooperation study and measure their impressions of the agents they encounter. Participants' perceptions of warmth and competence predict their stated preferences for different agents, above and beyond objective performance metrics. Drawing inspiration from social science and biology research, we subsequently implement a new "partner choice" framework to elicit revealed preferences: after playing an episode with an agent, participants are asked whether they would like to play the next episode with the same agent or to play alone. As with stated preferences, social perception better predicts participants' revealed preferences than does objective performance. Given these results, we recommend human-agent interaction researchers routinely incorporate the measurement of social perception and subjective preferences into their studies.