The duality of content and style is inherent to the nature of art. For humans, the two elements are clearly distinct: content refers to the objects and concepts in a piece of art, and style to the way it is expressed. This duality poses an important challenge for computer vision. The visual appearance of objects and concepts is modulated by style, which may reflect the author's emotions, social trends, artistic movement, etc., and a deep comprehension of art undoubtedly requires handling both. A promising step towards a general paradigm for art analysis is to disentangle content and style; relying instead on human annotations to isolate a single aspect of artworks has limitations in learning semantic concepts and the visual appearance of paintings. We thus present GOYA, a method that distills the artistic knowledge captured in a recent generative model to disentangle content and style. Experiments show that synthetically generated images serve as a sufficient proxy for the real distribution of artworks, allowing GOYA to represent the two elements of art separately while retaining more information than existing methods.