Stroke-based Rendering (SBR) aims to decompose an input image into a sequence of parameterized strokes that can be rendered into a painting resembling the input image. Recently, neural painting methods that use deep learning and reinforcement learning models to predict stroke sequences have been developed, but they suffer from long inference times or unstable training. To address these issues, we propose AttentionPainter, an efficient and adaptive model for single-step neural painting. First, we propose a novel scalable stroke predictor that predicts a large number of stroke parameters within a single forward pass, instead of the iterative prediction used by previous reinforcement learning or auto-regressive methods, which makes AttentionPainter faster than prior neural painting methods. To further increase training efficiency, we propose a Fast Stroke Stacking algorithm, which yields a 13× training speedup. Moreover, we propose a Stroke-density Loss that encourages the model to use small strokes for detailed regions, improving reconstruction quality. Finally, we propose a new stroke diffusion model for both conditional and unconditional stroke-based generation, which denoises in the stroke parameter space and enables stroke-based inpainting and editing applications useful to human artists. Extensive experiments show that AttentionPainter outperforms state-of-the-art neural painting methods.
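To illustrate the idea behind stacking many strokes without a sequential canvas update, the sketch below composites a batch of rendered strokes in one vectorized pass. This is a minimal illustration of parallel "over" compositing, not the paper's exact Fast Stroke Stacking algorithm; the function name and NumPy formulation are our own assumptions. The key observation is that after unrolling the sequential blend `canvas = canvas * (1 - a_i) + fg_i * a_i`, each stroke's final contribution is its color times its alpha times the product of `(1 - alpha)` of all later strokes, which can be computed with a reversed cumulative product.

```python
import numpy as np

def stack_strokes(foregrounds: np.ndarray, alphas: np.ndarray) -> np.ndarray:
    """Composite N rendered strokes onto a blank canvas in one pass.

    foregrounds: (N, H, W, 3) stroke colors, alphas: (N, H, W, 1) in [0, 1].
    Equivalent to sequentially blending stroke 0..N-1 with the "over" operator,
    but computed without an N-step loop over the canvas.
    """
    one_minus_a = 1.0 - alphas
    # rev_cum[i] = prod_{j >= i} (1 - alpha_j), via a reversed cumulative product
    rev_cum = np.cumprod(one_minus_a[::-1], axis=0)[::-1]
    # transmittance[i] = prod_{j > i} (1 - alpha_j): how much of stroke i
    # survives all strokes painted after it (the last stroke survives fully)
    transmittance = np.concatenate([rev_cum[1:], np.ones_like(alphas[:1])], axis=0)
    # Sum each stroke's surviving contribution in a single reduction
    return (foregrounds * alphas * transmittance).sum(axis=0)
```

Because every term is a batched elementwise product, this form maps well onto GPU tensors, which is the kind of restructuring that makes training with hundreds of strokes per image tractable.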