FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos

Digitising the 3D world into a clean, CAD model-based representation has important applications in augmented reality and robotics. Current state-of-the-art methods are computationally intensive because they encode each detected object individually and optimise CAD alignments in a second stage. In this work, we propose FastCAD, a real-time method that simultaneously retrieves and aligns CAD models for all objects in a given scene. In contrast to previous works, we directly predict alignment parameters and shape embeddings. We achieve high-quality shape retrievals by learning CAD embeddings in a contrastive learning framework and distilling those into FastCAD. Our single-stage method reduces inference time by a factor of 50 compared to other methods operating on RGB-D scans while outperforming them on the challenging Scan2CAD alignment benchmark. Further, our approach integrates seamlessly with online 3D reconstruction techniques, enabling real-time generation of precise CAD model-based reconstructions from videos at 10 FPS. In doing so, we improve the Scan2CAD alignment accuracy in the video setting from 43.0% to 48.2% and the reconstruction accuracy from 22.9% to 29.6%.
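The abstract does not say how a CAD model is selected once a shape embedding has been predicted. One common choice for embedding-based retrieval is a nearest-neighbour lookup by cosine similarity, and the sketch below illustrates that idea only. It is not FastCAD's implementation: the function names, the three-dimensional toy embeddings, and the CAD identifiers are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_cad(predicted_embedding, cad_database):
    """Return the id of the database entry most similar to the prediction.

    cad_database maps a (hypothetical) CAD model id to its embedding.
    """
    return max(
        cad_database,
        key=lambda cad_id: cosine_similarity(predicted_embedding, cad_database[cad_id]),
    )


# Toy database with made-up ids and embeddings (assumption, for illustration).
cad_database = {
    "chair_01": [0.9, 0.1, 0.0],
    "table_07": [0.0, 1.0, 0.2],
    "sofa_03": [0.1, 0.0, 1.0],
}

# Embedding a network might predict for one detected object (made up).
predicted = [0.8, 0.2, 0.1]
best_match = retrieve_cad(predicted, cad_database)
print(best_match)
```

Because the CAD embeddings can be computed once offline, a lookup of this kind only costs one similarity comparison per database entry at inference time, which is consistent with the single-stage design the abstract describes.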

