Deep neural networks have shown exemplary performance on semantic scene understanding tasks on source domains, but because training data lacks style diversity, improving performance on unseen target domains using only single-source-domain data remains challenging. Generating simulated data is a feasible alternative to collecting large, style-diverse real-world datasets, which is a cumbersome and budget-intensive process. However, the large domain-specific inconsistencies between simulated and real-world data pose a significant generalization challenge in semantic segmentation. In this work, to alleviate this problem, we propose a novel MultiResolution Feature Perturbation (MRFP) technique that randomizes domain-specific fine-grained features and perturbs the style of coarse features. Our experimental results on various urban-scene segmentation datasets clearly indicate that, along with the perturbation of style information, perturbation of fine-feature components is paramount for learning domain-invariant, robust feature maps for semantic segmentation models. MRFP is a simple, computationally efficient, transferable module with no additional learnable parameters or objective functions that helps state-of-the-art deep neural networks learn robust, domain-invariant features for simulation-to-real semantic segmentation.
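To make the idea of perturbing coarse-feature style and randomizing fine-grained components more concrete, the following is a minimal NumPy sketch. It is not the MRFP module itself: the abstract does not specify the architecture, so the average-pooling split into coarse and fine parts, the Gaussian jitter of per-channel statistics, and every name here (`perturb_style`, `split_coarse_fine`, `perturb_multires`, `strength`, `factor`) are assumptions chosen for illustration. The sketch shares only the properties stated above: it has no learnable parameters and no extra objective, and it is applied only during training.

```python
import numpy as np

def perturb_style(feat, rng, strength=0.5, eps=1e-5):
    """Jitter the per-channel mean and std (the 'style') of a (C, H, W) map."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True) + eps
    normalized = (feat - mean) / std
    # Assumed noise model: multiplicative Gaussian jitter on the statistics.
    new_mean = mean * (1.0 + strength * rng.standard_normal(mean.shape))
    new_std = np.abs(std * (1.0 + strength * rng.standard_normal(std.shape)))
    return normalized * new_std + new_mean

def split_coarse_fine(feat, factor=2):
    """Split a map into a low-resolution (coarse) part and a fine residual.

    Assumes H and W are divisible by `factor`.
    """
    c, h, w = feat.shape
    pooled = feat.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))
    coarse = pooled.repeat(factor, axis=1).repeat(factor, axis=2)  # nearest-neighbour upsampling
    return coarse, feat - coarse

def perturb_multires(feat, rng, strength=0.5, factor=2, training=True):
    """Illustrative stand-in: perturb coarse style and randomly rescale fine detail."""
    if not training:
        return feat  # perturbation is a training-time augmentation only
    coarse, fine = split_coarse_fine(feat, factor)
    coarse = perturb_style(coarse, rng, strength)
    # Hypothetical fine-feature randomization: a random per-channel gain.
    gain = 1.0 + strength * rng.standard_normal((feat.shape[0], 1, 1))
    return coarse + gain * fine

# Usage: perturb a random 4-channel, 8x8 feature map.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))
perturbed = perturb_multires(features, rng)
```

With `strength=0.0` the function returns the input up to floating-point error, which makes it easy to check that the coarse/fine split is lossless before any noise is added.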