Tensor-based representations are increasingly used for complex data types such as imaging data, owing to appealing properties such as dimension reduction and the preservation of spatial information. There is a growing literature on Bayesian scalar-on-tensor regression techniques, which use tensor-based representations of high-dimensional, spatially distributed covariates to predict continuous outcomes. Surprisingly, however, there has been limited development of corresponding Bayesian classification methods that rely on tensor-valued covariates. Standard approaches that vectorize the image are undesirable because they lose spatial structure, and alternative methods that use features extracted from the image in the predictive model may suffer from information loss. We propose a novel data augmentation-based Bayesian classification approach that relies on tensor-valued covariates, with a focus on imaging predictors. We propose two data augmentation schemes, one resulting in a support vector machine (SVM) classifier and the other yielding a logistic regression classifier. While both types of classifiers have been proposed independently in the literature, our contribution is to extend this existing methodology to accommodate high-dimensional tensor-valued predictors through low-rank decompositions of the coefficient matrix, while preserving the spatial information in the image. An efficient Markov chain Monte Carlo (MCMC) algorithm is developed for implementing these methods. Simulation studies show significant improvements in classification accuracy and parameter estimation compared to routinely used classification methods. We further illustrate our method in a neuroimaging application using cortical thickness MRI data from the Alzheimer's Disease Neuroimaging Initiative, with results displaying better classification accuracy across several classification tasks.
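To make the low-rank idea concrete, the following is a minimal sketch rather than the method developed in the paper. It assumes the coefficient matrix is written as a sum of R outer products (a PARAFAC-style form, which the abstract does not specify), and it only evaluates the linear predictor and the two decision rules. The data augmentation and MCMC steps are not shown, and all names (`low_rank_matrix`, `inner_product`, `U`, `V`) and sizes are hypothetical.

```python
import math
import random


def low_rank_matrix(U, V):
    """Return B = sum_r u_r v_r^T, where U is p1 x R and V is p2 x R."""
    R = len(U[0])
    return [[sum(U[i][r] * V[j][r] for r in range(R)) for j in range(len(V))]
            for i in range(len(U))]


def inner_product(X, B):
    """Frobenius inner product <X, B> between an image X and coefficients B."""
    return sum(x * b for x_row, b_row in zip(X, B) for x, b in zip(x_row, b_row))


random.seed(0)
p1, p2, R = 8, 6, 2  # illustrative image size and rank (assumed values)

# Random factors and a random "image" stand in for fitted parameters and data.
U = [[random.gauss(0, 1) for _ in range(R)] for _ in range(p1)]
V = [[random.gauss(0, 1) for _ in range(R)] for _ in range(p2)]
X = [[random.gauss(0, 1) for _ in range(p2)] for _ in range(p1)]

B = low_rank_matrix(U, V)
eta = inner_product(X, B)             # linear predictor; the image is never flattened

prob = 1.0 / (1.0 + math.exp(-eta))   # logistic-regression style class probability
svm_label = 1 if eta >= 0 else -1     # SVM-style decision from the sign of eta

full_params = p1 * p2                 # free entries in an unstructured B
low_rank_params = R * (p1 + p2)       # entries in the two factor matrices
```

Under this assumed form, the number of coefficients grows with the sum of the image dimensions rather than their product, which is the dimension reduction the abstract refers to, and each factor is indexed by a row or column of the image, so the spatial layout is retained.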