We present Paint Neural Stroke Field (PaintNeSF), a novel technique for generating stylized images of a 3D scene at arbitrary novel views from multi-view 2D images. Unlike existing methods, which apply stylization to trained neural radiance fields at the voxel level, our approach draws inspiration from image-to-painting methods and simulates the progressive painting process of human artwork with vector strokes. We develop a palette of stylized 3D strokes from basic primitives and splines, and treat 3D scene stylization as a multi-view reconstruction process based on these 3D stroke primitives. Because directly searching for the parameters of these 3D strokes would be too costly, we introduce a differentiable renderer that allows the stroke parameters to be optimized with gradient descent, and we propose a training scheme that alleviates the vanishing gradient issue. Extensive evaluation demonstrates that our approach effectively synthesizes 3D scenes with significant geometric and aesthetic stylization while maintaining a consistent appearance across different views. Our method can further be integrated with style losses and image-text contrastive models to extend its applications, including color transfer and text-driven 3D scene drawing.
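To make the idea of optimizing stroke parameters by gradient descent concrete, the following is a minimal one-dimensional sketch and not the renderer or training scheme of the paper. It assumes a "stroke" is an interval whose signed distance is turned into a soft occupancy by a sigmoid, and it assumes a coarse-to-fine `sharpness` schedule as one plausible way to counter vanishing gradients; the names `render_stroke`, `center_gradient`, `fit_center`, and all numeric settings are hypothetical.

```python
import numpy as np

def render_stroke(center, radius, xs, sharpness):
    """Soft occupancy of a 1D interval 'stroke' sampled at positions xs (toy assumption)."""
    sdf = np.abs(xs - center) - radius              # signed distance, negative inside
    return 1.0 / (1.0 + np.exp(sharpness * sdf))    # sigmoid(-sharpness * sdf)

def center_gradient(center, radius, xs, target, sharpness):
    """Analytic gradient of the mean squared error with respect to the stroke center."""
    occ = render_stroke(center, radius, xs, sharpness)
    # Chain rule: d occ / d sdf = -sharpness * occ * (1 - occ), d sdf / d center = -sign(x - center)
    docc_dcenter = sharpness * occ * (1.0 - occ) * np.sign(xs - center)
    return np.mean(2.0 * (occ - target) * docc_dcenter)

def fit_center(start, radius, xs, target, schedule, steps=500, lr=0.02):
    """Plain gradient descent on the center, one stage per sharpness value in the schedule."""
    center = start
    for sharpness in schedule:
        for _ in range(steps):
            center -= lr * center_gradient(center, radius, xs, target, sharpness)
    return center

xs = np.linspace(-1.0, 2.0, 601)
target = render_stroke(0.7, 0.1, xs, sharpness=100.0)   # sharp target stroke centered at 0.7

# Same number of steps in both runs; only the sharpness schedule differs.
sharp_only = fit_center(0.2, 0.1, xs, target, schedule=[100.0] * 5)
annealed = fit_center(0.2, 0.1, xs, target, schedule=[5.0, 10.0, 20.0, 50.0, 100.0])
print(f"sharp only: {sharp_only:.3f}, annealed: {annealed:.3f}")
```

In this toy setting, a sharp stroke that does not overlap the target receives almost no gradient and stays near its starting position, whereas starting with a blurry stroke and sharpening it gradually lets the center move toward the target.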