Humans can infer material characteristics of objects from their visual appearance, and this ability extends to artistic depictions, where similar perceptual strategies guide the interpretation of paintings or drawings. Among the factors that define material appearance, gloss, along with color, is widely regarded as one of the most important, and recent studies indicate that humans can perceive gloss independently of the artistic style used to depict an object. To investigate how gloss and artistic style are represented in learned models, we train an unsupervised generative model on a newly curated dataset of painterly objects designed to systematically vary such factors. Our analysis reveals a hierarchical latent space in which gloss is disentangled from other appearance factors, allowing for a detailed study of how gloss is represented and varies across artistic styles. Building on this representation, we introduce a lightweight adapter that connects our style- and gloss-aware latent space to a latent-diffusion model, enabling the synthesis of non-photorealistic images with fine-grained control of these factors. We compare our approach with previous models and observe improved disentanglement and controllability of the learned factors.