Technological advances have enabled the generation of unique and complementary types of data, or views (e.g., genomics, proteomics, metabolomics), and have opened a new era in multiview learning research with the potential to lead to new biomedical discoveries. We propose iDeepViewLearn (Interpretable Deep Learning Method for Multiview Learning) for learning nonlinear relationships in data from multiple views while achieving feature selection. iDeepViewLearn combines the flexibility of deep learning with the statistical benefits of data- and knowledge-driven feature selection, yielding interpretable results. Deep neural networks are used to learn a view-independent low-dimensional embedding through an optimization problem that minimizes the difference between observed and reconstructed data while imposing a regularization penalty on the reconstructed data. The normalized Laplacian of a graph is used to model bilateral relationships between variables in each view, thereby encouraging the selection of related variables. iDeepViewLearn is tested on simulated data and two real-world datasets, including breast cancer-related gene expression and methylation data. iDeepViewLearn achieved competitive classification results and identified genes and CpG sites that differentiated between individuals who died from breast cancer and those who did not. The results of our real data application and of simulations with small to moderate sample sizes suggest that iDeepViewLearn may be a useful method for small-sample-size problems compared with other deep learning methods for multiview learning.
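To make the role of the normalized Laplacian concrete, the following minimal sketch builds L = I - D^{-1/2} A D^{-1/2} for a small variable graph and evaluates the quadratic form w^T L w. This is an illustration under stated assumptions, not the iDeepViewLearn implementation: the function names, the toy adjacency matrix, the weight vectors, and the use of a plain quadratic form as the penalty are all hypothetical, and the exact penalty used by the method is not given in this abstract.

```python
import math


def normalized_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix.

    Assumes no self-loops. Isolated variables (degree 0) get a zero row and
    column, which is one common convention.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                lap[i][j] = 1.0 if deg[i] > 0 else 0.0
            else:
                lap[i][j] = -adj[i][j] * inv_sqrt[i] * inv_sqrt[j]
    return lap


def laplacian_penalty(weights, lap):
    """Quadratic form w^T L w (hypothetical penalty, for illustration only)."""
    n = len(weights)
    return sum(
        weights[i] * lap[i][j] * weights[j] for i in range(n) for j in range(n)
    )


# Hypothetical graph on three variables: 0 and 1 are connected, 2 is isolated.
adj = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
lap = normalized_laplacian(adj)

# Connected variables with equal weights incur no penalty ...
smooth = laplacian_penalty([1.0, 1.0, 0.0], lap)
# ... while opposite-signed weights on connected variables are penalized.
rough = laplacian_penalty([1.0, -1.0, 0.0], lap)
```

In this toy case the penalty is smaller when the two connected variables carry similar weights, which is the general mechanism by which a Laplacian term can favor selecting related variables together.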