We present Large Language Model for Mixed Reality (LLMR), a framework for the real-time creation and modification of interactive Mixed Reality experiences using LLMs. LLMR leverages novel strategies to tackle difficult cases where ideal training data is scarce, or where the design goal requires the synthesis of internal dynamics, intuitive analysis, or advanced interactivity. Our framework relies on text interaction and the Unity game engine. By incorporating techniques for scene understanding, task planning, self-debugging, and memory management, LLMR outperforms standard GPT-4 by 4x in average error rate. We demonstrate LLMR's cross-platform interoperability with several example worlds, and evaluate it on a variety of creation and modification tasks to show that it can produce and edit diverse objects, tools, and scenes. Finally, we conducted a usability study (N=11) with a diverse set of participants that revealed they had positive experiences with the system and would use it again.