Restoration Adaptation for Semantic Segmentation on Low Quality Images

In real-world scenarios, the performance of semantic segmentation often deteriorates on low-quality (LQ) images, which may lack clear semantic structures and high-frequency details. Although image restoration techniques offer a promising direction for enhancing degraded visual content, conventional real-world image restoration (Real-IR) models primarily focus on pixel-level fidelity and often fail to recover task-relevant semantic cues, which limits their effectiveness when they are applied directly to downstream vision tasks. Conversely, existing segmentation models trained on high-quality data lack robustness under real-world degradations. In this paper, we propose Restoration Adaptation for Semantic Segmentation (RASS), which integrates semantic image restoration into the segmentation process and enables high-quality semantic segmentation directly on LQ images. Specifically, we first propose a Semantic-Constrained Restoration (SCR) model, which injects segmentation priors into the restoration model by aligning its cross-attention maps with segmentation masks, encouraging semantically faithful image reconstruction. RASS then transfers this semantic restoration knowledge into the segmentation model through LoRA-based module merging and task-specific fine-tuning, thereby enhancing the model's robustness to LQ images. To validate the effectiveness of our framework, we construct a real-world LQ image segmentation dataset with high-quality annotations and conduct extensive experiments on both synthetic and real-world LQ benchmarks. The results show that SCR and RASS significantly outperform state-of-the-art methods on restoration and segmentation tasks, respectively. Code, models, and datasets will be available at https://github.com/Ka1Guan/RASS.git.
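As a point of reference for the LoRA-based module merging mentioned above, the following is a minimal sketch of standard LoRA weight merging, in which a low-rank update is folded into a frozen weight as W' = W + (alpha / r) · B · A. It is not the RASS implementation: the abstract does not specify the merging procedure, and the function names (`merge_lora`, `apply_linear`), the toy matrices, and the value of `alpha` are hypothetical choices made purely for illustration.

```python
# Illustrative sketch only: standard LoRA merging on toy matrices.
# All names and values here are assumptions, not taken from the RASS code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def merge_lora(weight, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank.

    weight: d_out x d_in frozen base weight
    lora_a: r x d_in down-projection
    lora_b: d_out x r up-projection
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

def apply_linear(weight, x):
    """Compute W @ x for a vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

# Hypothetical toy values: a 2x3 base weight with a rank-1 adapter.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
A = [[0.5, -0.5, 1.0]]
B = [[2.0], [-1.0]]
alpha = 2.0
x = [1.0, 2.0, 3.0]

# Merged path: fold the adapter into the weight, then apply once.
merged = merge_lora(W, A, B, alpha)
y_merged = apply_linear(merged, x)

# Unmerged path: base output plus the scaled adapter output.
scale = alpha / len(A)
adapter_out = apply_linear(B, apply_linear(A, x))
y_adapter = [wx + scale * d for wx, d in zip(apply_linear(W, x), adapter_out)]

print(merged)
print(y_merged, y_adapter)
```

Both paths produce the same output, which is why a merged adapter adds no inference cost. How RASS selects the modules to merge and how it fine-tunes them afterwards is described in the paper body rather than in this abstract.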
