Stochastic trace estimation is a standard tool for approximating the trace of a large-scale matrix that is accessible only through matrix-vector products. However, in tensor-structured settings, unstructured Gaussian or Rademacher test vectors may be prohibitively expensive to store and compute with, while cheaper rank-one tensor-product vectors can require sample complexities that grow exponentially with the tensor order $d$. This work studies Gaussian random tensor train vectors as a structured alternative for stochastic trace estimation. We show that, with a suitable choice of the tensor train rank, random tensor train vectors recover dimension-independent guarantees for the Girard--Hutchinson estimator. In particular, a median-of-means variant with tensor train rank $r \geq d-1$ achieves the same dependence on the accuracy $\varepsilon$ and failure probability $\delta$ as the classical estimator based on unstructured Gaussian vectors. We further prove an oblivious subspace injection result for sketches formed from independent Gaussian random tensor train vectors: tensor train rank $r \geq d-1$ and $\mathcal{O}(\varepsilon^{-2}(k+\log(1/\delta)))$ samples suffice for a $k$-dimensional target subspace. Finally, we investigate the use of such sketches within the Nyström++ framework. We show that the resulting estimator can achieve the desired $\mathcal{O}(\varepsilon^{-1})$ sample complexity under an additional spectral-tail condition. These results clarify both the potential and the limitations of random tensor train vectors in stochastic trace estimation.
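As a rough illustration of the median-of-means Girard--Hutchinson estimator with Gaussian random tensor train vectors described above, the following NumPy sketch may help. It is a toy under stated assumptions, not the implementation from this work: the function names (`gaussian_tt_vector`, `quadratic_form_samples`, `median_of_means`), the scaling $r^{-(d-1)/2}$ (chosen here so that $\mathbb{E}[xx^\top] = I$ for i.i.d. standard normal cores), the test matrix, and the sample and group counts are all illustrative choices. The tensor train vector is also expanded to a dense array for simplicity, which a practical implementation would avoid.

```python
import numpy as np


def gaussian_tt_vector(dims, rank, rng):
    """Draw a Gaussian random TT vector, returned densely (length prod(dims)).

    Assumption: cores have i.i.d. N(0, 1) entries and all interior TT ranks
    equal `rank`; the scaling rank**(-(d-1)/2) makes E[x x^T] = I.
    """
    d = len(dims)
    ranks = [1] + [rank] * (d - 1) + [1]
    x = np.ones((1, 1))  # running contraction, shape (entries so far, right rank)
    for j, n in enumerate(dims):
        core = rng.standard_normal((ranks[j], n, ranks[j + 1]))
        # Contract the right rank index of x with the left rank index of the core.
        x = np.tensordot(x, core, axes=([1], [0]))  # shape (m, n, r_next)
        x = x.reshape(-1, ranks[j + 1])
    return rank ** (-(d - 1) / 2) * x[:, 0]


def quadratic_form_samples(matvec, dims, rank, num_samples, rng):
    """Return the samples x^T A x for independent random TT vectors x."""
    samples = np.empty(num_samples)
    for s in range(num_samples):
        x = gaussian_tt_vector(dims, rank, rng)
        samples[s] = x @ matvec(x)
    return samples


def median_of_means(samples, num_groups):
    """Split the samples into groups, average each group, take the median."""
    groups = np.array_split(samples, num_groups)
    return float(np.median([g.mean() for g in groups]))


# Toy problem: a dense symmetric positive semidefinite matrix of size 3*3*3.
rng = np.random.default_rng(0)
dims = (3, 3, 3)               # tensor order d = 3
rank = len(dims) - 1           # TT rank r = d - 1, the threshold in the abstract
B = rng.standard_normal((27, 27))
A = B @ B.T

samples = quadratic_form_samples(lambda v: A @ v, dims, rank, 4000, rng)
estimate = median_of_means(samples, num_groups=8)
exact = float(np.trace(A))
rel_error = abs(estimate - exact) / exact
print(f"estimate = {estimate:.2f}, exact = {exact:.2f}, rel. error = {rel_error:.3f}")
```

In a tensor-structured setting, the matrix-vector product would presumably be applied to the tensor train cores directly, so that neither the vector nor the matrix is ever formed densely; the dense expansion here only keeps the sketch short and checkable.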