Multimodal systems are vulnerable to partial or complete loss of input channels at deployment, which undermines reliability in real-world settings. This paper presents ModalImmune, a training framework that enforces modality immunity by intentionally and controllably collapsing selected modality information during training so the model learns joint representations that are robust to destructive modality influence. The framework combines a spectrum-adaptive collapse regularizer, an information-gain guided controller for targeted interventions, curvature-aware gradient masking to stabilize destructive updates, and a certified Neumann-truncated hyper-gradient procedure for automatic meta-parameter adaptation. Empirical evaluation on standard multimodal benchmarks demonstrates that ModalImmune improves resilience to modality removal and corruption while retaining convergence stability and reconstruction capacity.
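The abstract names a Neumann-truncated hyper-gradient procedure but does not spell out its form. As a rough illustration only, the sketch below shows the standard truncated Neumann series for approximating an inverse-Hessian-vector product, the building block such procedures usually rely on. The function name, the toy matrix, and the values of `step_size` and `num_terms` are assumptions made for this example and are not taken from ModalImmune; how the method certifies the truncation is likewise not specified here.

```python
import numpy as np


def neumann_inverse_vector_product(hessian, vector, step_size, num_terms):
    """Approximate hessian^{-1} @ vector with a truncated Neumann series.

    Uses H^{-1} v ~= step_size * sum_{k=0}^{num_terms} (I - step_size * H)^k v,
    which converges when the spectral radius of (I - step_size * H) is below 1.
    Hypothetical helper for illustration; not the paper's implementation.
    """
    term = vector.copy()   # holds (I - step_size * H)^k v, starting at k = 0
    total = vector.copy()  # running sum of the series terms
    for _ in range(num_terms):
        term = term - step_size * (hessian @ term)
        total = total + term
    return step_size * total


# Toy symmetric positive-definite "Hessian" (an assumption for this example).
toy_hessian = np.array([[2.0, 0.5], [0.5, 1.0]])
toy_vector = np.array([1.0, -1.0])

exact = np.linalg.solve(toy_hessian, toy_vector)
coarse = neumann_inverse_vector_product(toy_hessian, toy_vector, 0.3, 5)
fine = neumann_inverse_vector_product(toy_hessian, toy_vector, 0.3, 200)

coarse_error = np.linalg.norm(coarse - exact)
fine_error = np.linalg.norm(fine - exact)
```

Because the remainder of the series shrinks geometrically with the number of terms, the truncation depth trades compute for accuracy; an error bound of this kind is one plausible reading of "certified", though the abstract does not confirm it.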