Multimodal Clinical Trial Outcome Prediction with Large Language Models

Clinical trials are pivotal but costly, often spanning multiple years and requiring substantial financial resources. Models that predict clinical trial outcomes can therefore help exclude drugs that are likely to fail, with the potential for significant cost savings. Recent data-driven approaches use deep learning to integrate multimodal data for this prediction task. However, they rely on manually designed modality-specific encoders, which limits both their extensibility to new modalities and their ability to discern similar information patterns across modalities. To address these issues, we propose LIFTED, a multimodal mixture-of-experts approach for clinical trial outcome prediction. Specifically, LIFTED unifies data from different modalities by transforming them into natural language descriptions. It then constructs unified noise-resilient encoders to extract information from the modality-specific language descriptions. Next, a sparse mixture-of-experts framework refines the representations, enabling LIFTED to identify similar information patterns across modalities and to extract more consistent representations from those patterns using the same expert model. Finally, a further mixture-of-experts module dynamically integrates the modality representations for prediction, allowing LIFTED to weigh modalities automatically and pay more attention to critical information. Experiments demonstrate that LIFTED significantly improves clinical trial outcome prediction across all three phases compared to the best baseline, showcasing the effectiveness of the proposed key components.
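The two routing ideas described above, sparse expert selection and gate-weighted fusion of modalities, can be illustrated with a minimal sketch. This is not the LIFTED implementation. The modality names (`drug`, `disease`, `criteria`), the dimensions, the random weights, and every function name are assumptions chosen for illustration, and the random vectors merely stand in for encoder outputs.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def top_k_gate(logits, k):
    """Keep the k largest logits and renormalise them; other experts get weight 0."""
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in chosen])
    weights = [0.0] * len(logits)
    for i, p in zip(chosen, probs):
        weights[i] = p
    return weights


def sparse_moe(x, gate_matrix, experts, k=2):
    """Route x to its top-k experts and return the gate-weighted sum of their outputs."""
    weights = top_k_gate(matvec(gate_matrix, x), k)
    out = [0.0] * len(x)
    for w, expert in zip(weights, experts):
        if w == 0.0:
            continue  # experts with zero gate weight are never evaluated
        y = matvec(expert, x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, weights


def fuse_modalities(embeddings, fusion_vector):
    """Score each modality embedding, softmax the scores, return the weighted average."""
    names = list(embeddings)
    scores = [sum(f * e for f, e in zip(fusion_vector, embeddings[n])) for n in names]
    weights = softmax(scores)
    dim = len(fusion_vector)
    fused = [sum(w * embeddings[n][d] for w, n in zip(weights, names)) for d in range(dim)]
    return fused, dict(zip(names, weights))


def random_matrix(rng, rows, cols):
    """Hypothetical stand-in for learned parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 3  # illustrative sizes, not taken from the paper
gate_matrix = random_matrix(rng, NUM_EXPERTS, DIM)
experts = [random_matrix(rng, DIM, DIM) for _ in range(NUM_EXPERTS)]  # shared by all modalities

# Random vectors standing in for the encoded text description of each modality.
encoded = {name: [rng.gauss(0.0, 1.0) for _ in range(DIM)]
           for name in ("drug", "disease", "criteria")}

refined = {}
for name, vec in encoded.items():
    refined[name], routing = sparse_moe(vec, gate_matrix, experts, k=2)

fused, modality_weights = fuse_modalities(refined, [rng.gauss(0.0, 1.0) for _ in range(DIM)])
```

Because every modality passes through the same pool of experts, inputs with similar patterns can be routed to the same expert regardless of which modality they came from. The fusion weights in `modality_weights` sum to one and indicate how much each modality contributes to the final representation.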
