We present a method for one-shot mask-guided image synthesis that enables controlled manipulation of a single image by inverting a quasi-robust classifier equipped with strong regularizers. Our method, called MAGIC, leverages structured gradients from a pre-trained quasi-robust classifier to better preserve the input semantics while maintaining the classifier's accuracy, thereby guaranteeing credibility in the synthesis. Unlike current methods that supervise the process with complex primitives or use attention maps as a weak supervisory signal, MAGIC aggregates gradients over the input, driven by a binary guide mask that enforces a strong spatial prior. Within a single framework, MAGIC implements a series of manipulations: shape and location control, intense non-rigid shape deformations, and copy/move operations in the presence of repeating objects. It gives users firm control over the synthesis by requiring them only to specify binary guide masks. Our findings are supported by qualitative comparisons with the state of the art on the same images sampled from ImageNet, by quantitative analysis using machine perception, and by a user survey in which 100+ participants endorse our synthesis quality. Project page at https://mozhdehrouhsedaghat.github.io/magic.html. Code is available at https://github.com/mozhdehrouhsedaghat/magic