This work introduces a new approach to automatic oil painting that emphasizes dynamic and expressive brushstrokes. A pivotal challenge is reducing duplicate and commonplace strokes, which often lead to less aesthetically pleasing results. Inspired by the human painting process, i.e., observing, comparing, and drawing, we incorporate differential image analysis into a neural oil painting model, allowing the model to concentrate on the incremental impact of successive brushstrokes. To operationalize this concept, we propose the Differential Query Transformer (DQ-Transformer), a new architecture that uses differentially derived image representations enriched with positional encoding to guide stroke prediction. This integration keeps the model sensitive to local details, resulting in more refined and nuanced stroke generation. Furthermore, we incorporate adversarial training into our framework, which improves the accuracy of stroke prediction and thereby the overall realism and fidelity of the synthesized paintings. Extensive qualitative evaluations, complemented by a controlled user study, show that our DQ-Transformer surpasses existing methods in both visual realism and artistic authenticity, typically with fewer strokes. The stroke-by-stroke painting animations are available on our project website.
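To make the idea of differentially derived queries concrete, the following is a minimal illustrative sketch, not the DQ-Transformer itself. The abstract does not specify how the differential representation is computed, so the pixel-wise subtraction of the current canvas from the target, the patch tiling, and the sinusoidal positional encoding are all assumptions, and the function names (`difference_map`, `positional_encoding`, `differential_queries`) are hypothetical.

```python
import math


def difference_map(target, canvas):
    """Per-pixel difference between the target image and the current canvas.

    Assumption: images are equally sized 2D lists of grayscale floats.
    """
    return [[t - c for t, c in zip(t_row, c_row)]
            for t_row, c_row in zip(target, canvas)]


def positional_encoding(index, dim):
    """Standard sinusoidal encoding of a patch index (dim must be even)."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(index * freq))
        enc.append(math.cos(index * freq))
    return enc


def differential_queries(target, canvas, patch):
    """Build one query vector per patch of the difference map.

    Assumption: each query is the flattened difference patch plus a
    positional encoding of the patch index. Image height and width must
    be divisible by `patch`, and patch * patch must be even.
    """
    diff = difference_map(target, canvas)
    height, width = len(diff), len(diff[0])
    dim = patch * patch
    queries = []
    index = 0
    for y in range(0, height, patch):
        for x in range(0, width, patch):
            tile = [diff[y + dy][x + dx]
                    for dy in range(patch) for dx in range(patch)]
            pe = positional_encoding(index, dim)
            queries.append([v + p for v, p in zip(tile, pe)])
            index += 1
    return queries


# Hypothetical usage: a blank 4x4 canvas compared against a uniform target.
target = [[1.0] * 4 for _ in range(4)]
canvas = [[0.0] * 4 for _ in range(4)]
queries = differential_queries(target, canvas, patch=2)
```

In this sketch, regions the canvas already matches yield a zero difference, so their queries carry only positional information. That is one plausible way a model could focus on what the next stroke still needs to change.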