SeamEdit: A Black-Box VLM-Agnostic Pipeline for Large-Image Semantic Editing

Semantic region editing for large images must satisfy two requirements at the same time: high generative quality and natural integration with surrounding content. Some related methods rely on white-box models and leave the strong generation capability of closed-source models underexplored. Directly applying closed-source models to tiled editing, however, introduces several failure modes: semantic deformation, canvas-level alignment drift, and visible seam artifacts. This paper presents SeamEdit, a training-free and model-agnostic pipeline that treats any VLM with inpainting capability as a black-box oracle. SeamEdit mitigates these issues through a five-stage post-hoc pipeline: overlay-based tile decomposition, black-box VLM inpainting, geometric and color-consistency correction, seam-risk-based multi-candidate ranking, and dynamic-programming curved seam fusion. The pipeline reduces seam visibility and supports semantic modification of arbitrary tile regions.
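As an illustration of the last stage, the sketch below shows one common way to find a curved seam by dynamic programming through the overlap between two tiles. It is a minimal sketch under assumptions, not the paper's implementation: the abstract does not specify the cost function or the fusion rule, so the per-pixel `cost` map, the one-column-per-row step limit, and the names `min_cost_seam` and `compose` are all hypothetical.

```python
from typing import List, Sequence


def min_cost_seam(cost: Sequence[Sequence[float]]) -> List[int]:
    """Return one column index per row forming a minimum-cost vertical seam.

    cost[r][c] is an assumed per-pixel mismatch between two tiles inside
    their overlap (for example, an absolute color difference). The seam may
    move by at most one column between consecutive rows, which lets it curve.
    """
    rows, cols = len(cost), len(cost[0])
    acc = [list(cost[0])]   # acc[r][c]: cheapest seam cost ending at (r, c)
    back = []               # back[r][c]: column chosen in row r for row r + 1
    for r in range(1, rows):
        prev = acc[-1]
        row_acc, row_back = [], []
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols - 1, c + 1)
            best = min(range(lo, hi + 1), key=lambda k: prev[k])
            row_acc.append(cost[r][c] + prev[best])
            row_back.append(best)
        acc.append(row_acc)
        back.append(row_back)

    # Start from the cheapest end point and walk the back-pointers upward.
    c = min(range(cols), key=lambda k: acc[-1][k])
    seam = [c]
    for r in range(rows - 2, -1, -1):
        c = back[r][c]
        seam.append(c)
    seam.reverse()
    return seam


def compose(left, right, seam):
    """Take pixels from `left` before the seam column and from `right` after."""
    return [
        [left[r][c] if c < seam[r] else right[r][c] for c in range(len(left[r]))]
        for r in range(len(left))
    ]


# Toy 3x3 overlap: the low-cost pixels zigzag, so the seam should follow them.
cost = [
    [5, 1, 5],
    [5, 5, 1],
    [5, 1, 5],
]
seam = min_cost_seam(cost)
left = [["L"] * 3 for _ in range(3)]
right = [["R"] * 3 for _ in range(3)]
fused = compose(left, right, seam)
```

In this toy case the seam follows the low-cost pixels rather than a straight vertical cut, which is the property that lets a curved seam avoid high-contrast regions. A real pipeline would likely feather or blend across the seam instead of switching sources abruptly.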
