Recent deep models for image shadow removal often rely on attention-based architectures to capture long-range dependencies. However, their fixed attention patterns tend to mix illumination cues from irrelevant regions, leading to distorted structures and inconsistent colors. In this work, we revisit shadow removal from a sequence modeling perspective and explore the use of Mamba, a selective state space model that propagates global context through directional state transitions. These transitions yield an efficient global receptive field while preserving positional continuity. Despite this potential, applying Mamba directly to image data is suboptimal: it lacks awareness of shadow versus non-shadow semantics and remains susceptible to color interference from nearby regions. To address these limitations, we propose CrossGate, a directional modulation mechanism that injects shadow-aware similarity into Mamba's input gate, allowing selective integration of relevant context along transition axes. To further ensure appearance fidelity, we introduce ColorShift regularization, a contrastive learning objective driven by global color statistics. By synthesizing structured, informative negatives, it guides the model to suppress color contamination and achieve robust color restoration. Together, these components adapt sequence modeling to the structural integrity and chromatic consistency required for shadow removal. Extensive experiments on public benchmarks demonstrate that the resulting model, DeshadowMamba, achieves state-of-the-art visual quality and strong quantitative performance.
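To make the idea of modulating an input gate with a shadow-aware similarity concrete, the following is a toy one-dimensional sketch. It is not the CrossGate formulation, which the abstract does not specify. The function name `gated_scan`, the scalar recurrence, the fixed `decay`, the soft shadow scores, and the similarity `1 - |s_t - s_ref|` are all assumptions introduced purely for illustration.

```python
def gated_scan(x, shadow, s_ref=0.0, decay=0.5):
    """Toy scalar scan: h_t = decay * h_{t-1} + g_t * x_t.

    x      : list of scalar token features along one scan direction
    shadow : list of soft shadow scores in [0, 1], one per token
    s_ref  : hypothetical reference score of the region whose context
             should be gathered (0.0 = non-shadow)
    decay  : fixed state decay (a real selective SSM learns this per token)

    The gate g_t = 1 - |s_t - s_ref| is an assumed stand-in for a
    shadow-aware similarity; it is not taken from the paper.
    """
    h = 0.0
    states = []
    for x_t, s_t in zip(x, shadow):
        g_t = 1.0 - abs(s_t - s_ref)   # similar tokens write more into the state
        h = decay * h + g_t * x_t      # directional state transition
        states.append(h)
    return states


# Tokens 0-1 are lit, tokens 2-3 are shadowed; only lit tokens feed the state.
states = gated_scan([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0])
```

In this sketch the shadowed tokens still receive the decayed state carried over from the lit tokens, but they contribute nothing of their own to it. This is one simple way to read "selective integration of relevant context along transition axes".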