Traditional methods for eliciting people's opinions face a trade-off between depth and scale: structured surveys enable large-scale data collection but limit respondents' ability to voice their opinions in their own words, while conversational interviews provide deeper insights but are resource-intensive. This study explores the potential of replacing human interviewers with large language models (LLMs) to conduct scalable conversational interviews. Our goal is to assess the performance of AI Conversational Interviewing and to identify opportunities for improvement in a controlled environment. We conducted a small-scale, in-depth study with university students who were randomly assigned to a conversational interview conducted by either an AI or a human interviewer, both following identical questionnaires on political topics. Various quantitative and qualitative measures were used to assess interviewer adherence to guidelines, response quality, participant engagement, and overall interview efficacy. The findings indicate that AI Conversational Interviewing is viable, producing data of quality comparable to traditional methods with the added benefit of scalability. We publish our data and materials for re-use and present specific recommendations for effective implementation.