Panoramic perception holds significant potential for autonomous driving, enabling vehicles to acquire a comprehensive 360° surround view in a single shot. However, autonomous driving is a data-driven task, and complete panoramic data acquisition requires complex sampling systems and annotation pipelines that are time-consuming and labor-intensive. Although existing street-view generation models have demonstrated strong data regeneration capabilities, they can only learn from the fixed data distributions of existing datasets and cannot leverage stitched pinhole images as a supervisory signal. In this paper, we propose Percep360, the first panoramic generation method for autonomous driving. Percep360 enables coherent generation of panoramic data with control signals based on stitched panoramic data, focusing on two key aspects: coherence and controllability. Specifically, to overcome the inherent information loss caused by the pinhole sampling process, we propose the Local Scenes Diffusion Method (LSDM), which reformulates panorama generation as a spatially continuous diffusion process and bridges the gap between different data distributions. In addition, to achieve controllable generation of panoramic images, we propose a Probabilistic Prompting Method (PPM) that dynamically selects the most relevant control cues. We evaluate the effectiveness of the generated images from three perspectives: image quality assessment (both no-reference and reference-based), controllability, and utility in real-world Bird's Eye View (BEV) segmentation. Notably, the generated data consistently outperforms the original stitched images on no-reference quality metrics and enhances downstream perception models. The source code will be publicly available at https://github.com/FeiT-FeiTeng/Percep360.