To date, the turn-taking models that determine voice agents' conduct have been examined primarily from a technical point of view, while the ways in which they emerge as interactional constraints or resources for human conversationalists in situ remain underexplored. Drawing on a detailed analysis of corpora of naturalistic data, we document how humans' conduct was produced with reference to the ever-present risk that, each time they spoke, their talk might trigger a new, uncalled-for contribution from the artificial agent. We examine this phenomenon in interactions involving both rule-based robots from a 'pre-LLM era' and the most recent voice agents. This 'omnirelevance of human speech' (i.e., the possibility that a conversational agent may erroneously respond to any speech it detects) emerged as a constitutive feature of these human-agent encounters. We describe some of the practices through which humans managed these artificial agents' turn-taking conduct. Given recent improvements in voice-capture technology, we ask whether this 'omnirelevance of human speech' weighs even more heavily on human practices today than it did in the past.