Developing a Novel Holistic, Personalized Dementia Risk Prediction Model via Integration of Machine Learning and Network Systems Biology Approaches

The prevalence of dementia has risen as global life expectancy improves and populations age. An individual's risk of developing dementia is influenced by genetic, lifestyle, and environmental factors, among others. Predicting that risk may allow individuals to adopt mitigation strategies or lifestyle changes that delay dementia onset. Current computational approaches to dementia prediction estimate risk from narrow categories of variables and do not account for interactions between different risk variables. The proposed framework takes a novel holistic approach to dementia risk prediction and is the first to combine multiple sources of tabular environmental pollution and lifestyle factor data with network systems biology-based genetic data. LightGBM gradient boosting was used to validate the included factors. Interactions between variables are modeled through an original weighted integration method named Sysable. The framework was trained with multiple machine learning models to reduce reliance on any single model. The developed approach surpassed all existing dementia risk prediction approaches, with a sensitivity of 85%, specificity of 99%, geometric accuracy of 92%, and AUROC of 91.7%. A transfer learning model was also implemented. De-biasing algorithms from the AI Fairness 360 library were applied to the model. The effects of demographic disparities on dementia prevalence were analyzed to highlight areas of need and promote equitable, accessible care. The resulting model was integrated into a user-friendly app that provides holistic predictions and personalized risk mitigation strategies. The developed model thus provides holistic computational dementia risk prediction intended for clinical use.
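
The reported metrics can be cross-checked with a short calculation. The sketch below assumes that "geometric accuracy" refers to the geometric mean of sensitivity and specificity (often called G-mean); the abstract does not define the term, so this reading is an assumption. The function names `g_mean` and `rates_from_counts` and the confusion-matrix counts are hypothetical and chosen only to reproduce the stated rates.

```python
import math


def rates_from_counts(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


def g_mean(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts that yield the rates quoted in the abstract.
sens, spec = rates_from_counts(tp=85, fn=15, tn=99, fp=1)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, g_mean={g_mean(sens, spec):.3f}")
```

Under this assumption, a sensitivity of 85% and a specificity of 99% give a geometric mean of about 0.917, which is consistent with the reported 92% after rounding.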

