This technical report addresses 360-degree panoramic image generation with diffusion models. Unlike ordinary 2D images, a 360-degree panoramic image captures the entire $360^\circ\times 180^\circ$ field of view, so its leftmost and rightmost edges must be continuous with each other. Maintaining this continuity is the main challenge in this field, and standard diffusion pipelines are not designed to generate such seamless 360-degree panoramic images. To this end, we propose a circular blending strategy, applied in both the denoising and VAE decoding stages, to maintain geometric continuity. Based on this strategy, we present two models for the \textbf{Text-to-360-panoramas} and \textbf{Single-Image-to-360-panoramas} tasks. The code has been released as an open-source project at \href{https://github.com/ArcherFMY/SD-T2I-360PanoImage}{https://github.com/ArcherFMY/SD-T2I-360PanoImage} and on \href{https://www.modelscope.cn/models/damo/cv_diffusion_text-to-360panorama-image_generation/summary}{ModelScope}.
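As a rough illustration of the circular blending idea, the following minimal sketch pads an array with a wrapped copy of its leftmost columns and later blends that extra strip back into the left edge with linear weights before cropping. It is not the released implementation: the function names `circular_pad` and `circular_blend`, the `overlap` parameter, the linear weighting, and the convention that width is the last axis are all assumptions made for illustration.

```python
import numpy as np


def circular_pad(x, overlap):
    """Append the leftmost `overlap` columns to the right edge (wrap-around).

    Assumption: width is the last axis of `x`.
    """
    return np.concatenate([x, x[..., :overlap]], axis=-1)


def circular_blend(x_ext, overlap):
    """Blend the extra right strip into the left strip, then crop.

    `x_ext` has width W + overlap. Its last `overlap` columns cover the same
    panorama region as its first `overlap` columns, but were processed as a
    direct continuation of the right edge.
    """
    width = x_ext.shape[-1] - overlap
    # Weight of the original left strip: 0 at column 0, 1 at column overlap-1.
    w = np.linspace(0.0, 1.0, overlap)
    left = x_ext[..., :overlap]
    right = x_ext[..., width:]
    out = x_ext[..., :width].copy()
    # Column 0 follows the right edge; the last blended column follows the left strip.
    out[..., :overlap] = (1.0 - w) * right + w * left
    return out


# Hypothetical usage on a latent-like array of shape (channels, height, width).
latent = np.random.rand(4, 8, 16)
extended = circular_pad(latent, overlap=4)
# ... a denoising or decoding step would operate on `extended` here ...
blended = circular_blend(extended, overlap=4)
```

In this sketch, column 0 of the result is taken entirely from the strip that continues past the right edge, so wrapping from the last column back to the first crosses no seam, while the weights hand over smoothly to the original left strip by the end of the overlap.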