Modern AI assistants are trained to follow instructions, implicitly assuming that users can clearly articulate their goals and the kind of assistance they need. Decades of behavioral research, however, show that people often engage with AI systems before their goals are fully formed. When AI systems treat prompts as complete expressions of intent, they can appear useful or convenient but are not necessarily aligned with users' needs. We call these failures Fantasia interactions. We argue that Fantasia interactions demand a rethinking of alignment research: rather than treating users as rational oracles, AI should provide cognitive support by actively helping users form and refine their intent over time. This requires an interdisciplinary approach that bridges machine learning, interface design, and behavioral science. We synthesize insights from these fields to characterize the mechanisms and failures of Fantasia interactions. We then show why existing interventions are insufficient and propose a research agenda for designing and evaluating AI systems that better help humans navigate uncertainty in their tasks.