Instruction-based image editing methods have attracted significant attention in recent years. However, although they encode a wide range of editing priors, these methods struggle with editing tasks that are difficult to describe accurately in language. To bridge this gap, we propose InstructBrush, an inversion method for instruction-based image editing methods. It extracts the editing effect from exemplar image pairs as an editing instruction, which is then applied to edit new images. InstructBrush introduces two key techniques, Attention-based Instruction Optimization and Transformation-oriented Instruction Initialization, to address the limitations of the previous method in inversion quality and instruction generalization. To evaluate how well instruction inversion methods guide image editing in open scenarios, we establish the Transformation-Oriented Paired Benchmark (TOP-Bench), which covers a rich set of scenes and editing types. This benchmark paves the way for further exploration of instruction inversion. Quantitative and qualitative results show that our approach achieves superior editing performance and is more semantically consistent with the target editing effects.