The process of painting fosters creativity and rational planning. However, existing generative AI mostly focuses on producing visually pleasing artworks, without emphasizing the painting process. We introduce a novel task, Collaborative Neural Painting (CNP), to facilitate collaborative art painting generation between humans and machines. Given any number of user-input brushstrokes as context, or just the desired object class, CNP should produce a sequence of strokes that supports the completion of a coherent painting. Importantly, the process can be gradual and iterative, allowing users to make modifications at any phase until completion. Moreover, we propose to solve this task using a painting representation based on a sequence of parametrized strokes, which makes both editing and composition operations easy. These parametrized strokes are processed by a Transformer-based architecture with a novel attention mechanism that models the relationship between the input strokes and the strokes to be completed. We also propose a new masking scheme to reflect the interactive nature of CNP, and we adopt diffusion models as the basic learning process for their effectiveness and diversity in generation. Finally, to develop and validate methods on the novel task, we introduce a new dataset of painted objects and an evaluation protocol to benchmark CNP both quantitatively and qualitatively. We demonstrate the effectiveness of our approach and the potential of the CNP task as a promising avenue for future research.