ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies

Automating immersive VR scene creation remains a primary research challenge. Existing methods typically rely on complex geometry with post-simplification, resulting in inefficient pipelines or limited realism. In this paper, we introduce ImmerseGen, a novel agent-guided framework for compact and photorealistic world generation that decouples realism from exhaustive geometric modeling. ImmerseGen represents scenes as hierarchical compositions of lightweight geometric proxies with synthesized RGBA textures, facilitating real-time rendering on mobile VR headsets. We propose terrain-conditioned texturing for base world generation, combined with context-aware texturing for scenery, to produce diverse and visually coherent worlds. VLM-based agents employ semantic grid-based analysis for precise asset placement and enrich scenes with multimodal enhancements such as visual dynamics and ambient sound. Experiments and real-time VR applications demonstrate that ImmerseGen achieves superior photorealism, spatial coherence, and rendering efficiency compared to existing methods.
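The abstract does not say how the RGBA-textured proxies are blended at render time. As a minimal sketch only, assuming the proxies are drawn back to front with the standard alpha "over" operator on straight (non-premultiplied) colors, the blending along one view ray could look like the code below. The names `ProxySample` and `composite_over` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[float, float, float]
RGBA = Tuple[float, float, float, float]  # straight (non-premultiplied), components in [0, 1]


@dataclass
class ProxySample:
    """Texture sample of one proxy along a single view ray (hypothetical structure)."""
    depth: float  # distance from the camera; larger means farther away
    rgba: RGBA


def composite_over(samples: List[ProxySample], background: RGB) -> RGB:
    """Blend samples back to front with the standard 'over' operator.

    Assumption: simple depth sorting is enough, which holds only when the
    proxies do not intersect along the ray.
    """
    r, g, b = background
    # Farthest sample first, so nearer layers are blended on top of it.
    for sample in sorted(samples, key=lambda s: s.depth, reverse=True):
        sr, sg, sb, alpha = sample.rgba
        r = sr * alpha + r * (1.0 - alpha)
        g = sg * alpha + g * (1.0 - alpha)
        b = sb * alpha + b * (1.0 - alpha)
    return (r, g, b)


# Usage: a half-transparent blue layer in front of an opaque red one.
layers = [
    ProxySample(depth=2.0, rgba=(0.0, 0.0, 1.0, 0.5)),
    ProxySample(depth=10.0, rgba=(1.0, 0.0, 0.0, 1.0)),
]
pixel = composite_over(layers, background=(0.0, 0.0, 0.0))
```

A fully transparent texel (alpha 0) leaves whatever lies behind it unchanged, which is what lets a flat proxy stand in for a detailed silhouette such as foliage without modeling the geometry itself.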



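The IEEE Conference on Virtual Reality has long been the premier international venue for presenting research results across the broad field of virtual reality (VR), and it seeks high-quality original papers on augmented reality (AR), mixed reality (MR), and 3D user interfaces. Each paper should be classified as primarily covering research, applications, or systems, according to the following guidelines. Research papers should describe results that advance the state of the art in software, hardware, algorithms, interaction, or human factors. Application papers should explain how the authors built on existing ideas and applied them to solve an interesting problem in a novel way. Each paper should include an evaluation of how successfully VR/AR/MR was used in the given application domain. Website: http://dblp.uni-trier.de/db/conf/vr/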
