In this paper, we explore the design space of procedural rules for multi-view stereo (MVS). We demonstrate that effective training data can be generated with SimpleProc, a new, fully procedural generator driven by a very small set of rules based on Non-Uniform Rational Basis Splines (NURBS) together with basic displacement and texture patterns. At a modest scale of 8,000 images, our approach achieves better results than the same number of manually curated images sourced from games and real-world objects. When scaled to 352,000 images, our method yields performance comparable to, and on several benchmarks exceeding, that of models trained on over 692,000 manually curated images. The source code and data are available at https://github.com/princeton-vl/SimpleProc.