To date, the turn-taking models that govern voice agents' conduct have been examined primarily from a technical point of view, while the ways in which they emerge as interactional constraints or resources for human conversationalists in situ remain underexplored. Drawing on a detailed analysis of corpora of naturalistic data, we document how humans produced their conduct in reference to the ever-present risk that, each time they spoke, their talk might trigger a new uncalled-for contribution from the artificial agent. We examine this phenomenon in interactions involving rule-based robots from a 'pre-LLM era' as well as the most recent voice agents. This 'omnirelevance of human speech' (i.e., the possibility that a conversational agent may erroneously respond to any speech it detects) emerged as a constitutive feature of these human-agent encounters. We describe some of the practices through which humans managed these artificial agents' turn-taking conduct. Given recent improvements in voice capture technology, we ask whether this 'omnirelevance of human speech' weighs even more heavily on human practices today than in the past.