We propose VL-DUN, a principled framework for joint All-in-One Medical Image Restoration and Segmentation (AiOMIRS) that bridges the gap between low-level signal recovery and high-level semantic understanding. While standard pipelines treat these tasks in isolation, our core insight is that they are fundamentally synergistic: restoration provides clean anatomical structures that improve segmentation, while semantic priors regularize the restoration process. VL-DUN resolves the sub-optimality of sequential processing through two primary innovations. (1) We formulate AiOMIRS as a unified optimization problem, deriving an interpretable joint unfolding mechanism in which restoration and segmentation are mathematically coupled for mutual refinement. (2) We introduce a frequency-aware Mamba mechanism that captures the long-range dependencies needed for global segmentation while preserving the high-frequency textures necessary for restoration. This enables efficient global context modeling with linear complexity and effectively mitigates the spectral bias of standard architectures. As a pioneering work on the AiOMIRS task, VL-DUN establishes a new state of the art across multi-modal benchmarks, improving PSNR by 0.92 dB and the Dice coefficient by 9.76\%. Our results demonstrate that joint collaborative learning offers a more accurate and more robust solution for complex clinical workflows than isolated task processing. The code is available at https://github.com/cipi666/VLDUN.
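To make the coupling concrete, the following is a minimal sketch of the kind of joint objective from which such an unfolding can be derived; the degradation operator $\mathcal{A}$, priors $R$ and $S$, coupling term $C$, and weights $\lambda, \mu, \gamma$ are illustrative assumptions rather than the exact terms used in VL-DUN.
% A minimal sketch of a joint restoration--segmentation objective
% (the specific terms are illustrative assumptions, not the paper's exact model):
% y: degraded observation, x: restored image, s: segmentation map.
\begin{equation}
\min_{x,\, s} \;\; \tfrac{1}{2}\,\|\mathcal{A}x - y\|_2^2
  + \lambda\, R(x) + \mu\, S(s) + \gamma\, C(x, s)
\end{equation}
% Alternating proximal updates, one pair per unfolded stage, give the
% mutual-refinement structure described above (eta is a step size):
\begin{align}
x^{(k+1)} &= \operatorname{prox}_{\lambda R \,+\, \gamma C(\cdot,\, s^{(k)})}
  \!\left(x^{(k)} - \eta\, \mathcal{A}^{\top}\!\big(\mathcal{A}x^{(k)} - y\big)\right),\\
s^{(k+1)} &= \operatorname{prox}_{\mu S \,+\, \gamma C(x^{(k+1)},\, \cdot)}\!\left(s^{(k)}\right).
\end{align}
Unfolding $K$ such alternating steps into network stages, with the proximal operators realized as learned modules, yields the interpretable, mutually refining architecture described above.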
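The frequency-aware design can likewise be illustrated with a short, self-contained sketch. The block below splits features into low- and high-frequency bands with an FFT-domain mask, applies a linear-complexity gated cumulative scan (a deliberately simplified stand-in for the Mamba selective state-space scan) to the low-frequency band for global context, and routes the high-frequency band through a local path so textures are preserved. All names and design details here are assumptions for illustration, not VL-DUN's actual implementation.
```python
# A minimal, self-contained sketch of a frequency-aware global-context block.
# The gated cumulative-mean recurrence is an O(N) stand-in for a Mamba-style
# selective state-space scan; everything here is illustrative, not VL-DUN.
import torch
import torch.nn as nn


class FrequencyAwareGlobalBlock(nn.Module):
    def __init__(self, channels: int, cutoff: float = 0.25):
        super().__init__()
        self.cutoff = cutoff  # normalized frequency radius separating the bands
        # Gated linear recurrence over the flattened spatial sequence.
        self.in_proj = nn.Linear(channels, channels)
        self.gate = nn.Linear(channels, channels)
        self.out_proj = nn.Linear(channels, channels)
        # Lightweight depthwise path that refines the preserved high frequencies.
        self.high_conv = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def _band_split(self, x: torch.Tensor):
        # Split features into low/high frequency bands with an FFT-domain mask.
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")
        fy = torch.fft.fftfreq(h, device=x.device).abs()           # (h,)
        fx = torch.fft.rfftfreq(w, device=x.device).abs()          # (w//2 + 1,)
        radius = torch.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)   # (h, w//2 + 1)
        low_mask = (radius <= self.cutoff).to(spec.dtype)
        low = torch.fft.irfft2(spec * low_mask, s=(h, w), norm="ortho")
        return low, x - low

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        low, high = self._band_split(x)
        # Global mixing on the low-frequency band: a cumulative-mean scan over
        # the H*W token sequence runs in linear time and gives every token a
        # global receptive field, loosely mimicking a state-space scan.
        seq = low.flatten(2).transpose(1, 2)                       # (b, h*w, c)
        u = self.in_proj(seq)
        scan = u.cumsum(dim=1) / torch.arange(
            1, h * w + 1, device=x.device, dtype=u.dtype
        ).view(1, -1, 1)
        mixed = self.out_proj(torch.sigmoid(self.gate(seq)) * scan)
        low_out = mixed.transpose(1, 2).reshape(b, c, h, w)
        # High frequencies bypass the global mixer so textures are preserved.
        return x + low_out + self.high_conv(high)


if __name__ == "__main__":
    block = FrequencyAwareGlobalBlock(channels=32)
    feats = torch.randn(2, 32, 64, 64)
    print(block(feats).shape)  # torch.Size([2, 32, 64, 64])
```
The key design point the sketch tries to convey is the split itself: the global, linear-complexity mixer never touches the high-frequency residual, which is exactly what lets such a block serve segmentation (global context) and restoration (texture fidelity) at once.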