We present Large Language Model for Mixed Reality (LLMR), a framework for the real-time creation and modification of interactive Mixed Reality experiences using LLMs. LLMR leverages novel strategies to tackle difficult cases where ideal training data is scarce, or where the design goal requires the synthesis of internal dynamics, intuitive analysis, or advanced interactivity. Our framework relies on text interaction and the Unity game engine. By incorporating techniques for scene understanding, task planning, self-debugging, and memory management, LLMR achieves an average error rate 4x lower than that of standard GPT-4. We demonstrate LLMR's cross-platform interoperability with several example worlds, and evaluate it on a variety of creation and modification tasks to show that it can produce and edit diverse objects, tools, and scenes. Finally, we conducted a usability study (N=11) with a diverse set of participants, which revealed that they had positive experiences with the system and would use it again.