We present Fillerbuster, a method that completes unknown regions of a 3D scene using a novel large-scale multi-view latent diffusion transformer. Casual captures are often sparse and miss surrounding content behind objects or above the scene. Existing methods are ill-suited to this challenge: they either focus on making the known pixels look good with sparse-view priors, or on creating the missing sides of objects from just one or two photos. In practice, we often have hundreds of input frames and want to complete areas that are missing and unobserved in those frames. Additionally, the images often lack known camera parameters. Our solution is to train a generative model that can consume a large context of input frames while generating unknown target views and recovering image poses when desired. We show results where we complete partial captures on two existing datasets. We also present an uncalibrated scene completion task in which our unified model predicts poses and creates new content simultaneously. Our model is the first to predict many images and poses together for scene completion.