StratMed: Relevance Stratification between Biomedical Entities for Sparsity on Medication Recommendation

As the imbalance between limited medical resources and escalating demand grows, AI-based clinical tasks have become increasingly important. One such task, medication recommendation, combines longitudinal patient history with medical knowledge to help physicians prescribe safer and more accurate medication combinations. Existing work ignores the inherent long-tailed distribution of medical data, learns unevenly from frequent and sparse data, and fails to balance safety and accuracy. To address these limitations, we propose StratMed, which introduces a stratification strategy that overcomes the long-tail problem and learns sparse data more fully. It also uses a dual-property network to address the mutual constraint between the safety and accuracy of medication combinations, enhancing the two properties together. Specifically, we first construct a pre-training method using deep learning networks to obtain medication and disease representations. We then design a pyramid-like stratification method based on relevance to strengthen the expressiveness of sparse data. Based on this relevance, we design two graph structures that express medication safety and precision at the same level to obtain patient representations. Finally, the patient's historical clinical information is fitted to generate medication combinations for the current health condition. We evaluate our model on the MIMIC-III dataset against state-of-the-art methods across three aspects. Compared with the second-best baseline, our model reduces safety risk by 15.08%, improves accuracy by 0.36%, and reduces training time by 81.66%.
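
The abstract does not say how the relevance strata are formed. As a minimal sketch only, the following assumes that relevance between a disease and a medication can be approximated by a co-occurrence count, and that strata are equal-width bins on a log scale. Both choices, along with the function name `stratify_by_relevance` and the toy codes, are hypothetical and are not taken from StratMed itself. The point is simply to illustrate why stratifying a long-tailed quantity can keep sparse pairs from collapsing into a single bucket.

```python
import math


def stratify_by_relevance(pair_counts, num_levels=4):
    """Map each (disease, medication) pair to a level in 0..num_levels-1.

    Illustrative assumption: relevance is a raw co-occurrence count, and
    levels are equal-width bins over log(1 + count). This is not the
    stratification rule used in the paper.
    """
    if not pair_counts:
        return {}

    # Compress the long tail so that rare pairs are spread across levels.
    log_scores = {pair: math.log1p(count) for pair, count in pair_counts.items()}
    lowest = min(log_scores.values())
    highest = max(log_scores.values())
    span = highest - lowest

    levels = {}
    for pair, score in log_scores.items():
        if span == 0:
            # All pairs are equally relevant; use a single level.
            levels[pair] = 0
        else:
            level = int((score - lowest) / span * num_levels)
            # The maximum score lands exactly on num_levels; clamp it.
            levels[pair] = min(level, num_levels - 1)
    return levels


# Hypothetical long-tailed co-occurrence counts.
pair_counts = {
    ("d1", "m1"): 1000,
    ("d1", "m2"): 100,
    ("d2", "m1"): 10,
    ("d2", "m3"): 1,
}
levels = stratify_by_relevance(pair_counts, num_levels=4)
# Each pair lands in its own level: 3, 2, 1 and 0 respectively.
```

With linear bins over the raw counts, the three rarer pairs above would all fall into level 0; on the log scale each occupies a separate level, which is the intuition behind giving sparse data its own strata.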
