Large language models (LLMs) have gained immense popularity due to their impressive capabilities in unstructured conversations. However, they underperform compared to previous approaches in task-oriented dialogue (TOD), wherein reasoning and accessing external information are crucial. Empowering LLMs with advanced prompting strategies such as reasoning and acting (ReAct) has shown promise in solving complex tasks traditionally requiring reinforcement learning. In this work, we apply the ReAct strategy to guide LLMs performing TOD. We evaluate ReAct-based LLMs (ReAct-LLMs) both in simulation and with real users. While ReAct-LLMs underperform state-of-the-art approaches in simulation, human evaluation indicates a higher user satisfaction rate compared to handcrafted systems, despite a lower success rate.