Conversation is the subject of increasing interest in the social, cognitive, and computational sciences. Yet as conversational datasets grow in size and complexity, researchers lack scalable methods for segmenting speech-to-text transcripts into conversational turns, the basic building blocks of social interaction. We discuss this challenge and then introduce "NaturalTurn," a turn segmentation algorithm designed to accurately capture the dynamics of naturalistic exchange. NaturalTurn operates by distinguishing speakers' primary conversational turns from listeners' secondary utterances, such as backchannels, brief interjections, and other forms of parallel speech that characterize conversation. Using data from a large conversation corpus, we show that NaturalTurn-derived transcripts have more favorable statistical and inferential characteristics than transcripts derived from existing methods. The NaturalTurn algorithm represents an improvement in machine-generated transcript processing methods, or "turn models," that will enable researchers to link turn-taking dynamics with the broader outcomes that result from social interaction, a central goal of conversation science.