In healthcare, the integration of multimodal data is pivotal for developing comprehensive diagnostic and predictive models. However, managing missing data remains a significant challenge in real-world applications. We introduce MARIA (Multimodal Attention Resilient to Incomplete datA), a novel transformer-based deep learning model designed to address these challenges through an intermediate fusion strategy. Unlike conventional approaches that depend on imputation, MARIA utilizes a masked self-attention mechanism that processes only the available data without generating synthetic values. This approach enables it to handle incomplete datasets effectively, enhancing robustness and minimizing the biases introduced by imputation methods. We evaluated MARIA against 10 state-of-the-art machine learning and deep learning models across 8 diagnostic and prognostic tasks. The results demonstrate that MARIA surpasses existing methods in both predictive performance and resilience to varying levels of data incompleteness, underscoring its potential for critical healthcare applications.
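To illustrate the core idea of attending only over observed inputs, the following is a minimal NumPy sketch of masked self-attention over per-modality tokens. It is not the authors' implementation: the identity Q/K/V projections, the token layout, and the `observed` boolean mask are placeholder assumptions for illustration. Missing modalities are excluded from the attention softmax, so no synthetic values ever enter the computation.

```python
import numpy as np

def masked_self_attention(tokens, observed):
    """Self-attention restricted to observed modality tokens.

    tokens:   (n, d) array, one embedding per modality; rows for
              missing modalities may contain arbitrary placeholders.
    observed: (n,) boolean array, True where the modality is present.
    Returns an (n, d) array; rows for missing modalities are zeroed.
    """
    n, d = tokens.shape
    # Identity projections stand in for learned Q/K/V weights
    # (an assumption for this sketch, not the model's parameters).
    q, k, v = tokens, tokens, tokens
    scores = q @ k.T / np.sqrt(d)                      # (n, n)
    # Mask columns of missing modalities: attention weight to an
    # unobserved token becomes exactly zero after the softmax,
    # so no imputed value can influence the output.
    scores = np.where(observed[None, :], scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    out = weights @ v
    # Missing modalities also produce no output of their own.
    return out * observed[:, None]
```

Each output row of an observed modality is a convex combination of observed tokens only, which is what makes the mechanism robust to incompleteness without imputation.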