We present Large Language Model for Mixed Reality (LLMR), a framework for the real-time creation and modification of interactive Mixed Reality experiences using LLMs. LLMR leverages novel strategies to tackle difficult cases where ideal training data is scarce, or where the design goal requires the synthesis of internal dynamics, intuitive analysis, or advanced interactivity. Our framework relies on text interaction and the Unity game engine. By incorporating techniques for scene understanding, task planning, self-debugging, and memory management, LLMR achieves an average error rate 4x lower than that of standard GPT-4. We demonstrate LLMR's cross-platform interoperability with several example worlds, and evaluate it on a variety of creation and modification tasks to show that it can produce and edit diverse objects, tools, and scenes. Finally, we conducted a usability study (N=11) with a diverse set of participants, who reported positive experiences with the system and said they would use it again.