Compass vs Railway Tracks: Unpacking User Mental Models for Communicating Long-Horizon Work to Humans vs. AI

As AI systems (foundation models, agentic systems) grow increasingly capable of operating for minutes or hours at a time, users' prompts are transforming into highly detailed, elaborate specifications for the AI to autonomously work on. While interactive prompting has been extensively studied, comparatively less is known about how people communicate specifications for these types of long-horizon tasks. In a qualitative study in which 16 professionals drafted specifications for both a human colleague and an AI, we found a core divergence in how participants specified problems to humans versus AI: they approached communication with humans as providing a "compass", offering high-level intent to encourage flexible exploration. In contrast, communication with AI resembled painstakingly laying down "railway tracks": rigid, exhaustive instructions to minimize ambiguity and deviation. This strategy was driven by a perception that current AI has limited ability to infer intent, prioritize, and make judgments on its own. When envisioning an ideal AI collaborator, users expressed a desire for a hybrid between current AI and human colleagues: a collaborator that blends AI's efficiency and large context window with the critical thinking and agency of a human colleague. We discuss design implications for future AI systems, proposing that they align on outcomes through generated rough drafts, verify feasibility via end-to-end "test runs," and monitor execution through intelligent check-ins, ultimately transforming AI from a passive instruction-follower into a reliable collaborator for ambiguous, long-horizon problems.

