Synthetic datasets are widely used for training urban scene recognition models, but even highly realistic renderings show a noticeable gap to real imagery. This gap is particularly pronounced when adapting to a specific target domain, such as Cityscapes, where differences in architecture, vegetation, object appearance, and camera characteristics limit downstream performance. Closing this gap with more detailed 3D modelling would require expensive asset and scene design, defeating the purpose of low-cost labelled data. To address this, we present a new framework that adapts an off-the-shelf diffusion model to a target domain using only imperfect pseudo-labels. Once trained, it generates high-fidelity, target-aligned images from semantic maps of any synthetic dataset, including low-effort sources created in hours rather than months. The method filters suboptimal generations, rectifies image-label misalignments, and standardises semantics across datasets, transforming weak synthetic data into competitive real-domain training sets. Experiments on five synthetic datasets and two real target datasets show segmentation gains of up to +8.0 mIoU percentage points over state-of-the-art translation methods, making rapidly constructed synthetic datasets as effective as high-effort synthetic datasets requiring months of manual design. This work highlights a valuable collaborative paradigm where fast semantic prototyping, combined with generative models, enables scalable, high-quality training data creation for urban scene understanding.