LIM: Large Interpolator Model for Dynamic Reconstruction

Reconstructing dynamic assets from video data is central to many computer vision and graphics tasks. Existing 4D reconstruction approaches are limited to category-specific models or rely on slow optimization-based methods. Inspired by the recent Large Reconstruction Model (LRM), we present the Large Interpolation Model (LIM), a transformer-based feed-forward solution, guided by a novel causal consistency loss, for interpolating implicit 3D representations across time. Given implicit 3D representations at times $t_0$ and $t_1$, LIM produces a deformed shape at any continuous time $t\in[t_0,t_1]$, delivering high-quality interpolated frames in seconds. Furthermore, LIM allows explicit mesh tracking across time, producing a consistently UV-textured mesh sequence ready for integration into existing production pipelines. We also use LIM, in conjunction with a diffusion-based multiview generator, to produce dynamic 4D reconstructions from monocular videos. We evaluate LIM on various dynamic datasets, benchmarking it against image-space interpolation methods (e.g., FiLM) and direct triplane linear interpolation, and demonstrate clear advantages over both. In summary, LIM is the first feed-forward model capable of high-speed tracked 4D asset reconstruction across diverse categories.
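For intuition, the direct triplane linear interpolation baseline mentioned above can be sketched as a plain element-wise blend of the two keyframe representations. This is not LIM itself, which replaces the blend with a learned transformer. The function names `lerp_nested` and `interpolate_triplane` and the nested-list triplane layout (three planes, each a grid of feature values) are assumptions made for illustration only.

```python
from typing import List, Union

# A triplane is modelled here as same-shaped nested lists of floats
# (hypothetical layout: 3 planes x height x width).
Nested = Union[float, List["Nested"]]


def lerp_nested(a: Nested, b: Nested, alpha: float) -> Nested:
    """Element-wise (1 - alpha) * a + alpha * b over same-shaped nested lists."""
    if isinstance(a, list):
        if not isinstance(b, list) or len(a) != len(b):
            raise ValueError("triplanes must have identical shapes")
        return [lerp_nested(x, y, alpha) for x, y in zip(a, b)]
    return (1.0 - alpha) * a + alpha * b


def interpolate_triplane(tp0: Nested, tp1: Nested,
                         t0: float, t1: float, t: float) -> Nested:
    """Linearly blend the keyframe triplanes at t0 and t1 for a time t in [t0, t1]."""
    if not t0 < t1:
        raise ValueError("expected t0 < t1")
    if not t0 <= t <= t1:
        raise ValueError("t must lie in [t0, t1]")
    alpha = (t - t0) / (t1 - t0)  # normalised position between the keyframes
    return lerp_nested(tp0, tp1, alpha)


if __name__ == "__main__":
    # Toy keyframes: 3 planes, each 1 x 2, holding a single feature value per cell.
    tp_start = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 0.0]]]
    tp_end = [[[2.0, 2.0]], [[2.0, 2.0]], [[2.0, 2.0]]]
    print(interpolate_triplane(tp_start, tp_end, 0.0, 1.0, 0.5))
```

Because the blend acts on feature values independently at each grid cell, it cross-fades the two shapes rather than moving geometry along a trajectory, which is one plausible reason a learned interpolator would do better on large deformations.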

