Semantic region editing for large images must satisfy two requirements at once: high generative quality and natural integration with the surrounding content. Some existing methods rely on white-box models, leaving the strong generative capability of closed-source models underexplored. Applying closed-source models directly to tiled editing, however, introduces several failure modes: semantic deformation, canvas-level alignment drift, and visible seam artifacts. This paper presents SeamEdit, a training-free, model-agnostic pipeline that treats any VLM with inpainting capability as a black-box oracle. SeamEdit mitigates these failure modes through a five-stage post-hoc pipeline: overlay-based tile decomposition, black-box VLM inpainting, geometric and color-consistency correction, seam-risk-based multi-candidate ranking, and dynamic-programming curved seam fusion. The pipeline reduces seam visibility and supports semantic modification of arbitrary tile regions.
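To make the final stage concrete, the following is a minimal sketch of how a dynamic-programming curved seam could be computed over the overlap strip shared by two adjacent tiles. It is an illustration under stated assumptions, not SeamEdit's actual implementation: the function names `find_min_cost_seam` and `fuse_overlap`, the per-pixel absolute-difference cost, the grayscale list-of-lists input, and the one-column-per-row seam constraint are all hypothetical choices made for this example.

```python
def find_min_cost_seam(cost):
    """Return, for each row, the column of a minimum-cost top-to-bottom seam.

    cost is a list of rows of non-negative numbers. The seam may move by at
    most one column between consecutive rows (an assumption of this sketch).
    """
    rows, cols = len(cost), len(cost[0])
    # acc[r][c] = cheapest cumulative cost of any seam ending at (r, c).
    acc = [[float(v) for v in cost[0]]]
    for r in range(1, rows):
        prev = acc[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 2, cols)
            row.append(cost[r][c] + min(prev[lo:hi]))
        acc.append(row)

    # Backtrack from the cheapest cell in the last row.
    seam = [0] * rows
    seam[-1] = min(range(cols), key=acc[-1].__getitem__)
    for r in range(rows - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, cols)
        seam[r] = min(range(lo, hi), key=acc[r].__getitem__)
    return seam


def fuse_overlap(left, right):
    """Fuse two same-shape grayscale overlap strips along a curved seam.

    Pixels left of the seam come from `left`; the rest come from `right`.
    The absolute-difference cost is a placeholder, not the paper's cost.
    """
    cost = [[abs(a - b) for a, b in zip(lr, rr)] for lr, rr in zip(left, right)]
    seam = find_min_cost_seam(cost)
    fused = [
        [lr[c] if c < seam[r] else rr[c] for c in range(len(lr))]
        for r, (lr, rr) in enumerate(zip(left, right))
    ]
    return fused, seam


if __name__ == "__main__":
    left = [[1, 2, 3], [4, 5, 6]]
    right = [[9, 2, 7], [8, 0, 6]]
    fused, seam = fuse_overlap(left, right)
    print(seam)   # column index of the seam in each row
    print(fused)  # fused overlap strip
```

The seam follows pixels where the two tiles already agree, so the transition tends to fall where it is least visible. A real pipeline would presumably use a richer cost (for example, gradient or color differences) and operate on multi-channel images.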