Concatenated Matrix SVD: Compression Bounds, Incremental Approximation, and Error-Constrained Clustering

Large collections of matrices arise throughout modern machine learning, signal processing, and scientific computing, where they are commonly compressed by concatenation followed by truncated singular value decomposition (SVD). This strategy enables parameter sharing and efficient reconstruction, and it has been widely adopted in domains such as multi-view learning, signal processing, and neural network compression. However, it leaves a fundamental question unanswered: which matrices can be safely concatenated and compressed together under explicit reconstruction error constraints? Existing approaches rely on heuristic or architecture-specific grouping and provide no principled guarantees on the resulting SVD approximation error. In this work, we introduce a theory-driven framework for compression-aware clustering of matrices under SVD compression constraints. Our analysis establishes new spectral bounds for horizontally concatenated matrices, deriving global upper bounds on the optimal rank-$r$ SVD reconstruction error from lower bounds on singular value growth. The first bound follows from Weyl-type monotonicity under blockwise extensions, while the second leverages singular values of incremental residuals to yield tighter, per-block guarantees. We further develop an efficient approximate estimator based on incremental truncated SVD that tracks dominant singular values without forming the full concatenated matrix. Building on these results, we propose three clustering algorithms that merge matrices only when their predicted joint SVD compression error remains below a user-specified threshold. The algorithms span a trade-off between speed, provable accuracy, and scalability, enabling compression-aware clustering with explicit error control.
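To make the two kinds of bound concrete, the following is a minimal NumPy sketch, not the paper's actual estimators. It assumes two blocks $A$ and $B$ with the same number of rows and works with squared Frobenius error. The function names (`rank_r_error_sq`, `monotonicity_bound_sq`, `residual_bound_sq`) are hypothetical. The first bound uses only the standard fact that $\sigma_i([A\ B]) \ge \sigma_i(A)$; the second is a simplified residual-based bound obtained by projecting both blocks onto the top-$r$ left singular subspace of $A$, which is an illustrative stand-in for the per-block guarantee described above.

```python
import numpy as np

def rank_r_error_sq(M, r):
    """Exact squared Frobenius error of the best rank-r approximation
    (Eckart-Young: the sum of the squared trailing singular values)."""
    s = np.linalg.svd(M, compute_uv=False)
    return float(np.sum(s[r:] ** 2))

def monotonicity_bound_sq(A, B, r):
    """Upper bound on the rank-r error of [A B] from A's spectrum alone.
    Uses sigma_i([A B]) >= sigma_i(A), so the energy captured by the top-r
    singular values of [A B] is at least that captured by those of A."""
    total_energy = np.linalg.norm(A, "fro") ** 2 + np.linalg.norm(B, "fro") ** 2
    s_A = np.linalg.svd(A, compute_uv=False)
    return float(total_energy - np.sum(s_A[:r] ** 2))

def residual_bound_sq(A, B, r):
    """Tighter upper bound (illustrative assumption, not the paper's formula):
    approximate [A B] by projecting onto A's top-r left singular subspace.
    The optimal rank-r error cannot exceed the error of this rank-r candidate."""
    U, s_A, _ = np.linalg.svd(A, full_matrices=False)
    U_r = U[:, :r]
    tail_A = np.sum(s_A[r:] ** 2)          # error on the A block
    R = B - U_r @ (U_r.T @ B)              # residual of B outside that subspace
    return float(tail_A + np.linalg.norm(R, "fro") ** 2)

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
# B lies almost entirely in the column space of A, so joint compression is cheap.
B = A @ rng.standard_normal((8, 8)) + 0.01 * rng.standard_normal((20, 8))
r = 6

exact = rank_r_error_sq(np.hstack([A, B]), r)
loose = monotonicity_bound_sq(A, B, r)
tight = residual_bound_sq(A, B, r)
print(f"exact={exact:.4f}  residual bound={tight:.4f}  monotonicity bound={loose:.4f}")
```

In this sketch the residual-based bound is never larger than the monotonicity bound, because it subtracts the energy of $B$ that already lies in the retained subspace. A merge rule in the spirit of the abstract could compare either bound against a user-specified threshold before concatenating.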
