Human cancers present a significant public health challenge, and addressing them requires the discovery of novel drugs through translational research. Transcriptomics profiling data, which describe molecular activities in tumors and cancer cell lines, are widely used to predict anti-cancer drug responses. However, existing AI models are hampered by noise in transcriptomics data and by a lack of biological interpretability. To overcome these limitations, we introduce VETE (Variational and Explanatory Transcriptomics Encoder), a novel neural network framework that incorporates a variational component to mitigate the effects of noise and integrates a traceable gene ontology into the network architecture for encoding cancer transcriptomics data. Key innovations include a local interpretability-guided method for identifying ontology paths, a visualization tool for elucidating the biological mechanisms of drug responses, and the application of centralized large-scale hyperparameter optimization. VETE demonstrated robust accuracy in cancer cell line classification and drug response prediction. It also provided traceable biological explanations for both tasks, offering insight into the mechanisms underlying its predictions. VETE thus bridges the gap between AI-driven predictions and biologically meaningful insights in cancer research and represents a promising advance in the field.