Users can discuss a wide range of topics with large language models (LLMs), but they do not always prefer to solve problems or obtain information through lengthy conversations. This raises an intriguing HCI question: How does instructing LLMs to engage in longer or shorter conversations affect conversation quality? In this paper, we developed two Slack chatbots using GPT-4 that can vary conversation length, and we conducted a user study. Participants asked the chatbots both highly conversable and less conversable questions, engaging in dialogues with 0, 3, 5, and 7 conversational turns. We found that conversation quality did not differ drastically across conditions, while participants had mixed reactions. Our study demonstrates LLMs' ability to change conversation length and the potential benefits for users resulting from such changes, but we caution that changes in text form do not necessarily imply changes in quality or content.