Humans can infer material characteristics of objects from their visual appearance, and this ability extends to artistic depictions, where similar perceptual strategies guide the interpretation of paintings or drawings. Among the factors that define material appearance, gloss and color are widely regarded as among the most important, and recent studies indicate that humans can perceive gloss independently of the artistic style used to depict an object. To investigate how gloss and artistic style are represented in learned models, we train an unsupervised generative model on a newly curated dataset of painterly objects designed to vary these factors systematically. Our analysis reveals a hierarchical latent space in which gloss is disentangled from other appearance factors, allowing for a detailed study of how gloss is represented and how it varies across artistic styles. Building on this representation, we introduce a lightweight adapter that connects our style- and gloss-aware latent space to a latent-diffusion model, enabling the synthesis of non-photorealistic images with fine-grained control of these factors. We compare our approach with previous models and observe improved disentanglement and controllability of the learned factors.