We introduce WonderJourney, a modularized framework for perpetual 3D scene generation. Unlike prior work on view generation, which focuses on a single type of scene, we start at any user-provided location (given as a text description or an image) and generate a journey through a long sequence of diverse yet coherently connected 3D scenes. We leverage an LLM to generate textual descriptions of the scenes in this journey, a text-driven point cloud generation pipeline to produce a compelling and coherent sequence of 3D scenes, and a large VLM to verify the generated scenes. We show compelling, diverse visual results across various scene types and styles, forming imaginary "wonderjourneys". Project website: https://kovenyu.com/WonderJourney/
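The three modules named in the abstract (an LLM that describes scenes, a text-driven point cloud generator, and a VLM that verifies results) can be pictured as a simple loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `Scene`, `generate_journey`, the three callables, and the retry-then-stop policy controlled by `max_attempts` are all hypothetical names and choices introduced here for illustration. The abstract states only that the VLM verifies generated scenes; it does not say what happens on rejection.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Scene:
    """Hypothetical container: a text description plus its point cloud."""
    description: str
    points: Any  # placeholder for whatever point cloud representation is used


def generate_journey(
    start_description: str,
    describe_next: Callable[[List[str]], str],              # stands in for the LLM
    build_scene: Callable[[str, Optional[Scene]], Scene],   # stands in for point cloud generation
    verify: Callable[[Scene], bool],                        # stands in for the VLM check
    num_scenes: int,
    max_attempts: int = 3,                                  # assumed retry budget
) -> List[Scene]:
    """Chain scenes one after another, keeping only verified ones."""
    descriptions: List[str] = []
    scenes: List[Scene] = []
    for index in range(num_scenes):
        # The first scene comes from the user; later ones from the description module.
        description = start_description if index == 0 else describe_next(descriptions)
        previous = scenes[-1] if scenes else None

        accepted: Optional[Scene] = None
        for _ in range(max_attempts):
            candidate = build_scene(description, previous)
            if verify(candidate):
                accepted = candidate
                break

        if accepted is None:
            # Assumed policy: end the journey if no candidate passes verification.
            break

        descriptions.append(description)
        scenes.append(accepted)
    return scenes
```

Passing the modules in as plain callables mirrors the modular framing: any one of them can be swapped without touching the loop. Starting from an image rather than text is not covered by this sketch.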