Qualitative methods such as interviews produce richer data than quantitative surveys, but are difficult to scale. Switching from web-based questionnaires to interactive chatbots offers a compromise, improving user engagement and response quality. Uptake remains limited, however, because of the gap between users' expectations and the capabilities of natural language processing methods. In this study, we evaluate the potential of large language models (LLMs) to support an information elicitation chatbot that narrows this "gulf of expectations" (Luger & Sellen 2016). We conduct a user study in which participants (N = 399) are randomly assigned to interact with either a rule-based chatbot or one of two LLM-augmented chatbots. We observe limited evidence of differences in user engagement or response richness between conditions. However, the addition of LLM-based dynamic probing skills produces significant improvements in both quantitative and qualitative measures of user experience, consistent with a narrowing of the expectations gulf.