Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may be useful for robots, but they remain prone to confidently hallucinated predictions. In this work, we present KnowNo, a framework for measuring and aligning the uncertainty of LLM-based planners so that they know when they don't know and ask for help when needed. KnowNo builds on the theory of conformal prediction to provide statistical guarantees on task completion while minimizing human help in complex multi-step planning settings. Experiments across a variety of simulated and real robot setups, involving tasks with different modes of ambiguity (e.g., from spatial to numeric uncertainties, from human preferences to Winograd schemas), show that KnowNo performs favorably against modern baselines (which may involve ensembles or extensive prompt tuning) in terms of improving efficiency and autonomy, while providing formal assurances. KnowNo can be used with LLMs out of the box without model finetuning, and suggests a promising lightweight approach to modeling uncertainty that can complement and scale with the growing capabilities of foundation models. Website: https://robot-help.github.io
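To make the idea of "asking for help when needed" concrete, the following is a minimal sketch of generic split conformal prediction applied to a multiple-choice planner. It is not taken from the abstract, which does not give the procedure; it assumes the planner exposes a confidence score in [0, 1] for each candidate action, that a held-out calibration set with known correct actions is available, and that help is requested whenever the prediction set is not a single action. All function names, the score format, and the error level `epsilon` are hypothetical.

```python
import math


def conformal_threshold(cal_true_scores, epsilon):
    """Return the calibrated nonconformity threshold q_hat.

    cal_true_scores: confidence the model gave to the CORRECT option on each
    calibration example (assumed to lie in [0, 1]).
    epsilon: target error rate, e.g. 0.2 for 80% coverage.
    """
    n = len(cal_true_scores)
    # Nonconformity = 1 - confidence in the true option; sort ascending.
    nonconformity = sorted(1.0 - s for s in cal_true_scores)
    # Finite-sample corrected rank (1-indexed) of the empirical quantile.
    k = math.ceil((n + 1) * (1.0 - epsilon))
    if k > n:
        # Too few calibration points for this epsilon: keep every option.
        return 1.0
    return nonconformity[k - 1]


def prediction_set(option_scores, q_hat):
    """Keep every option whose nonconformity does not exceed q_hat."""
    return [opt for opt, s in option_scores.items() if 1.0 - s <= q_hat]


def plan_or_ask(option_scores, q_hat):
    """Act if exactly one option survives; otherwise ask a human."""
    options = prediction_set(option_scores, q_hat)
    if len(options) == 1:
        return ("act", options[0])
    return ("ask", options)


if __name__ == "__main__":
    # Hypothetical calibration confidences on the correct options.
    cal = [0.9, 0.8, 0.7, 0.6, 0.95, 0.85, 0.75, 0.65, 0.55]
    q_hat = conformal_threshold(cal, epsilon=0.2)
    print(plan_or_ask({"A": 0.7, "B": 0.2, "C": 0.1}, q_hat))
```

Under the usual exchangeability assumption between calibration and test examples, the prediction set contains the correct option with probability at least 1 - epsilon, so lowering epsilon trades more requests for help against a lower chance of acting on a wrong plan.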