Cocaine Use Prediction with Tensor-based Machine Learning on Multimodal MRI Connectome Data

This paper investigates machine learning algorithms for predicting cocaine use from magnetic resonance imaging (MRI) connectomic data. The study used functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 individuals, which were parcellated into 246 regions of interest (ROIs) using the Brainnetome atlas. After preprocessing, the datasets were transformed into tensor form. We developed a tensor-based unsupervised machine learning algorithm to reduce the size of the data tensor from $275$ (individuals) $\times 2$ (fMRI and dMRI) $\times 246$ (ROIs) $\times 246$ (ROIs) to $275$ (individuals) $\times 2$ (fMRI and dMRI) $\times 6$ (clusters) $\times 6$ (clusters). The reduction was achieved by applying the high-order Lloyd algorithm to group the ROIs into 6 clusters. Features were extracted from the reduced tensor and combined with demographic features (age, gender, race, and HIV status). The resulting dataset was used to train a CatBoost model with subsampling and nested cross-validation, which achieved a prediction accuracy of 0.857 for identifying cocaine users. The model was also compared with other models, and its feature importance is reported. Overall, this study highlights the potential of tensor-based machine learning algorithms for predicting cocaine use from MRI connectomic data and presents a promising approach for identifying individuals at risk of substance abuse.
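The size reduction described above can be illustrated with a minimal sketch. It does not implement the high-order Lloyd algorithm; it assumes the ROI cluster labels are already available and that each entry of the reduced tensor is the mean connectivity within the corresponding pair of clusters, which is an assumption and not something the abstract states. The function name `reduce_by_clusters`, the synthetic data, and the small subject count used to keep the demo light are all hypothetical.

```python
import numpy as np

def reduce_by_clusters(tensor, labels, n_clusters):
    """Average ROI-by-ROI connectivity over each pair of ROI clusters.

    tensor: array of shape (subjects, modalities, n_roi, n_roi)
    labels: integer array of shape (n_roi,), values in [0, n_clusters)
    Returns an array of shape (subjects, modalities, n_clusters, n_clusters).
    """
    n_roi = labels.shape[0]
    # One-hot membership matrix: membership[i, a] = 1 if ROI i is in cluster a.
    membership = np.zeros((n_roi, n_clusters))
    membership[np.arange(n_roi), labels] = 1.0
    counts = membership.sum(axis=0)
    if np.any(counts == 0):
        raise ValueError("every cluster must contain at least one ROI")
    # Block sums via broadcasting matmul over the last two axes,
    # then divide by the number of entries in each block.
    block_sums = membership.T @ tensor @ membership
    return block_sums / np.outer(counts, counts)

# Synthetic stand-in data; the study's tensor is 275 x 2 x 246 x 246.
rng = np.random.default_rng(0)
n_subjects, n_modalities, n_roi, n_clusters = 4, 2, 246, 6
X = rng.standard_normal((n_subjects, n_modalities, n_roi, n_roi))
labels = np.arange(n_roi) % n_clusters  # placeholder labels, not Lloyd output

R = reduce_by_clusters(X, labels, n_clusters)
print(R.shape)  # (4, 2, 6, 6)
```

With real cluster assignments in place of the placeholder `labels`, the same call would map each subject's two $246 \times 246$ connectivity matrices to two $6 \times 6$ matrices, from which features could then be extracted.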

