Recent advances in multimodal large language models (LLMs) have made it easier to rapidly prototype AI-powered features, especially for mobile use cases. However, gathering early, mobile-situated user feedback on these AI prototypes remains challenging. The broad scope and flexibility of LLMs mean that, for a given use-case-specific prototype, there is a crucial need to understand the wide range of in-the-wild inputs users are likely to provide and their in-context expectations for the AI's behavior. To explore the concept of in situ AI prototyping and testing, we created MobileMaker: a platform that enables designers to rapidly create and test mobile AI prototypes directly on devices. The tool also enables testers to make on-device, in-the-field revisions to prototypes using natural language. In an exploratory study with 16 participants, we examined how user feedback on prototypes created with MobileMaker compares to feedback gathered with existing prototyping tools (e.g., Figma, prompt editors). Our findings suggest that MobileMaker prototypes enabled more serendipitous discovery of model input edge cases, discrepancies between the AI's and the user's in-context interpretations of the task, and contextual signals missed by the AI. Furthermore, we learned that while the ability to make in-the-wild revisions led users to feel more fulfilled as active participants in the design process, it might also constrain their feedback to the subset of changes they perceived as more actionable or implementable by the prototyping tool.