Reliable learning from multimodal data (e.g., multi-omics) is a problem of widespread concern, especially in safety-critical applications such as medical diagnosis. However, low-quality data induced by multimodal noise poses a major challenge in this domain, and existing methods suffer from two key limitations. First, they struggle to handle heterogeneous data noise, hindering robust multimodal representation learning. Second, they exhibit limited adaptability and generalization when encountering previously unseen noise. To address these issues, we propose the Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD). On one hand, TAHCD introduces Adaptive Stable Subspace Alignment and Sample-Adaptive Confidence Alignment to reliably remove heterogeneous noise. These modules account for noise at both the global and instance levels and enable the joint removal of modality-specific and cross-modality noise, achieving robust learning. On the other hand, TAHCD introduces Test-Time Cooperative Enhancement, which adaptively updates the model in response to input noise in a label-free manner, thereby improving generalization. This is achieved by collaboratively enhancing the joint removal of modality-specific and cross-modality noise across the global and instance levels according to the noise present in each sample. Experiments on multiple benchmarks demonstrate that the proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.