To handle ambiguous and open-ended requests, Large Language Models (LLMs) are increasingly trained to interact with users to surface intents they have not yet expressed (e.g., by asking clarifying questions). However, user requests are often ambiguous because the users have not yet formed their intents: they must observe and explore outcomes to discover what they want. Simply asking "what kind of tone do you want?" fails when users themselves do not know. We introduce DiscoverLLM, a novel and generalizable framework that trains LLMs to help users form and discover their intents. Central to our approach is a user simulator that models the user's cognitive state as a hierarchy of intents that progressively concretize as the model surfaces relevant options; the degree of concretization serves as a reward signal that models can be trained to optimize. The resulting models learn to collaborate with users by adaptively diverging (i.e., exploring options) when intents are unclear and converging (i.e., refining and implementing) when intents concretize. Across the proposed interactive benchmarks in creative writing, technical writing, and SVG drawing, DiscoverLLM achieves over 10% higher task performance while reducing conversation length by up to 40%. In a user study with 75 human participants, DiscoverLLM improved conversation satisfaction and efficiency compared to baselines.
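To make the reward idea concrete, the following is a minimal toy sketch, not the authors' simulator. It assumes the intent hierarchy is a tree, that a child intent can only be discovered once its parent has been, that "surfacing an option" can be approximated by exact (case-insensitive) label matching, and that the reward is the per-turn increase in the fraction of discovered intents. The names `Intent`, `concretization`, and `observe` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Intent:
    """One node in a hypothetical intent hierarchy (illustrative only)."""
    label: str
    children: List["Intent"] = field(default_factory=list)
    discovered: bool = False


def all_nodes(node):
    """Yield the node and every descendant, depth first."""
    yield node
    for child in node.children:
        yield from all_nodes(child)


def concretization(root):
    """Fraction of intents in the hierarchy that the simulated user has discovered."""
    nodes = list(all_nodes(root))
    return sum(n.discovered for n in nodes) / len(nodes)


def observe(root, surfaced_options):
    """Update the simulated user after one assistant turn and return the reward.

    Assumption: a child intent is discoverable only once its parent is
    discovered, so finer-grained intents concretize progressively.
    """
    before = concretization(root)
    surfaced = {opt.lower() for opt in surfaced_options}
    # Build the frontier before mutating anything, so an intent discovered
    # this turn does not unlock its own children within the same turn.
    frontier = [
        child
        for node in all_nodes(root)
        if node.discovered
        for child in node.children
        if not child.discovered
    ]
    for child in frontier:
        if child.label.lower() in surfaced:
            child.discovered = True
    # Reward: how much more concrete the user's intents became this turn.
    return concretization(root) - before
```

In this sketch, an assistant turn that surfaces a fine-grained option too early (before its parent intent is formed) earns no reward, while a turn that offers options matching the current frontier does. This loosely mirrors the diverge-then-converge behavior described above; a realistic simulator would need a far richer notion of when a response "surfaces" an intent.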