We consider the problem of jointly modeling and clustering populations of tensors by introducing a high-dimensional tensor mixture model with heterogeneous covariances. To handle the high dimensionality of tensor objects, we adopt dimension-reduction assumptions that exploit the intrinsic structure of tensors, such as low-rankness in the mean and separability in the covariance. For estimation, we develop an efficient high-dimensional expectation-conditional-maximization (HECM) algorithm that breaks the intractable optimization in the M-step into a sequence of much simpler conditional optimization problems, each of which is convex, admits regularization, and has a closed-form update. The theoretical analysis faces two challenges: the non-convexity of EM-type estimation, and the fact that only the solutions of the conditional maximizations in the M-step are available. Together these lead to the notion of dual non-convexity. We show that the proposed HECM algorithm, with an appropriate initialization, converges geometrically to a neighborhood that is within statistical precision of the true parameter. The efficacy of the proposed method is demonstrated through comparative numerical experiments and an application to a medical study, where it achieves improved clustering accuracy over existing benchmark methods.
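To make the conditional-maximization idea concrete, the following is a minimal sketch and not the HECM algorithm itself. It assumes order-2 tensors (matrices) with a matrix-normal density in each cluster, so the separable covariance reduces to a row covariance `U` and a column covariance `V`. It omits the low-rank mean constraint and the regularization described above, and it takes the initial cluster means as an input because the abstract does not specify an initialization scheme. The names `log_density`, `ecm`, `M_init`, and `ridge` are hypothetical and chosen only for illustration.

```python
import numpy as np


def log_density(X, M, U, V):
    """Matrix-normal log-density of each X[i] (shape (n, p, q)).

    M is the (p, q) mean, U the (p, p) row covariance and V the (q, q)
    column covariance, i.e. cov(vec(X)) is the Kronecker product of V and U.
    """
    n, p, q = X.shape
    R = X - M
    Ui, Vi = np.linalg.inv(U), np.linalg.inv(V)
    # quad[i] = tr(U^{-1} R_i V^{-1} R_i^T)
    quad = np.einsum("nij,ik,nkl,lj->n", R, Ui, R, Vi)
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    return -0.5 * (p * q * np.log(2 * np.pi) + q * logdet_U + p * logdet_V + quad)


def ecm(X, M_init, n_iter=30, ridge=1e-6):
    """Simplified ECM for a mixture of matrix normals (illustrative only).

    X: (n, p, q) observations; M_init: (K, p, q) initial cluster means.
    `ridge` is a small numerical safeguard added to each covariance
    update, not the regularization used in the paper.
    """
    n, p, q = X.shape
    M = np.array(M_init, dtype=float)
    K = M.shape[0]
    U = np.stack([np.eye(p)] * K)
    V = np.stack([np.eye(q)] * K)
    pi = np.full(K, 1.0 / K)

    for _ in range(n_iter):
        # E-step: posterior cluster probabilities, computed in log space.
        logw = np.stack(
            [np.log(pi[k]) + log_density(X, M[k], U[k], V[k]) for k in range(K)],
            axis=1,
        )
        logw -= logw.max(axis=1, keepdims=True)
        tau = np.exp(logw)
        tau /= tau.sum(axis=1, keepdims=True)

        # CM-steps: update one parameter block at a time, holding the
        # others fixed; each update is available in closed form.
        for k in range(K):
            w = tau[:, k]
            s = w.sum()
            pi[k] = s / n
            M[k] = np.einsum("n,nij->ij", w, X) / s
            R = X - M[k]
            # Row covariance given the current column covariance.
            Vi = np.linalg.inv(V[k])
            U[k] = np.einsum("n,nij,jl,nkl->ik", w, R, Vi, R) / (q * s)
            U[k] += ridge * np.eye(p)
            # Column covariance given the freshly updated row covariance.
            Ui = np.linalg.inv(U[k])
            V[k] = np.einsum("n,nij,ik,nkl->jl", w, R, Ui, R) / (p * s)
            V[k] += ridge * np.eye(q)

    return pi, M, U, V, tau
```

Cluster labels can be read off as `tau.argmax(axis=1)`. Note that `U[k]` and `V[k]` are identified only up to a reciprocal scale factor, since their Kronecker product is what enters the likelihood; the sketch leaves that scale unnormalized.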