Despite the popularity of multimodal statistical models, rigorous statistical inference tools for assessing the significance of a single modality within a multimodal model are lacking, especially in high-dimensional settings. For high-dimensional multimodal generalized linear models, we propose a novel entropy-based metric, called the expected relative entropy, to quantify the information gain from one modality beyond all other modalities in the model. We propose a deviance-based statistic to estimate the expected relative entropy, prove that it is consistent, and show that its asymptotic distribution can be approximated by a non-central chi-squared distribution. This enables the construction of confidence intervals and p-values for assessing the significance of the expected relative entropy of a given modality. We evaluate the empirical performance of the proposed inference tool through simulations and apply it to a multimodal neuroimaging dataset, demonstrating good performance across various high-dimensional multimodal generalized linear models.
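As a minimal sketch (not the paper's actual estimator), the testing step described above can be illustrated as follows, assuming a deviance-based statistic whose distribution is approximated by a non-central chi-squared law with known degrees of freedom. Under the null hypothesis that the modality contributes no information, the noncentrality is zero and the reference distribution reduces to a central chi-squared; the moment-based noncentrality estimate below uses the fact that a non-central chi-squared with k degrees of freedom and noncentrality lambda has mean k + lambda.

```python
# Hypothetical illustration of p-value and noncentrality estimation for a
# deviance-based statistic; function names and arguments are assumptions
# for this sketch, not the paper's notation.
from scipy.stats import chi2


def modality_p_value(deviance_stat: float, dof: int) -> float:
    """P-value for the null of zero information gain from the modality.

    Under the null the noncentrality is zero, so the non-central
    chi-squared approximation reduces to a central chi-squared with
    `dof` degrees of freedom.
    """
    return chi2.sf(deviance_stat, df=dof)


def noncentrality_estimate(deviance_stat: float, dof: int) -> float:
    """Moment-based noncentrality estimate: E[chi2_k(lam)] = k + lam."""
    return max(deviance_stat - dof, 0.0)
```

A large deviance statistic relative to its degrees of freedom yields a small p-value, indicating that the modality carries significant additional information.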