In this paper, we study interpretable image enhancement, a technique that improves image quality by adjusting the parameters of filters with easily understandable names such as "Exposure" and "Contrast". Unlike methods that rely on predefined image editing filters, our framework employs learnable filters that acquire interpretable names through training. Our contribution is twofold. First, we introduce a novel filter architecture called an image-adaptive neural implicit lookup table, which uses a multilayer perceptron to implicitly define the transformation from the input feature space to the output color space. By incorporating image-adaptive parameters directly into the input features, we achieve highly expressive filters. Second, we introduce a prompt guidance loss to assign an interpretable name to each filter. Using a vision-and-language model together with guiding prompts, we evaluate the visual impressions of the enhancement results, such as exposure and contrast. We further impose a constraint ensuring that each filter affects only its targeted visual impression without influencing other attributes, which yields the desired filter effects. Experimental results show that our method outperforms existing predefined-filter-based methods, thanks to filters optimized to predict the target results. Our source code is available at https://github.com/satoshi-kosugi/PG-IA-NILUT.
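The core idea of the image-adaptive neural implicit lookup table can be sketched as follows: an MLP maps input colors, concatenated with per-image adaptive parameters, to output colors. This is a minimal illustrative sketch, not the paper's implementation; the layer sizes (`K`, `HIDDEN`), the residual formulation, and the random stand-in weights are all assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 color channels + K image-adaptive parameters in,
# one hidden layer, 3 color channels out. The actual model's depth, width,
# and trained weights may differ; weights here are random stand-ins.
K = 4        # number of image-adaptive parameters (assumed)
HIDDEN = 16  # hidden-layer width (assumed)

W1 = rng.normal(scale=0.1, size=(3 + K, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(HIDDEN, 3))
b2 = np.zeros(3)

def neural_implicit_lut(rgb, params):
    """Map input colors to output colors, conditioned on per-image parameters.

    rgb:    (N, 3) array of input colors in [0, 1]
    params: (K,)   image-adaptive parameters predicted from the image
    """
    # Concatenate the adaptive parameters onto every pixel's color features.
    x = np.concatenate([rgb, np.broadcast_to(params, (rgb.shape[0], K))], axis=1)
    h = np.maximum(x @ W1 + b1, 0.0)              # ReLU hidden layer
    return np.clip(rgb + h @ W2 + b2, 0.0, 1.0)   # residual color mapping

pixels = rng.uniform(size=(5, 3))  # five example input pixels
theta = rng.uniform(size=K)        # e.g. strengths for "Exposure", "Contrast", ...
out = neural_implicit_lut(pixels, theta)
print(out.shape)  # (5, 3)
```

Because the parameters enter the MLP as inputs rather than as fixed LUT entries, the same network can realize a different color transform for every image, which is what makes the filters expressive.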