The sparse matrix compression problem asks for a one-dimensional representation of a binary $n \times \ell$ matrix, consisting of an integer array of row indices and a shift value for each row, such that any matrix entry can be accessed in constant time by consulting this representation. The decision problem of finding an integer array of length $\ell+ρ$, or of restricting the shift function to values of at most $ρ$, is known to be NP-complete (cf. the textbook of Garey and Johnson). As a practical heuristic, a greedy algorithm has been proposed that shifts the $i$-th row until it forms a solution together with its predecessor rows. Although this greedy algorithm is valued for its good approximation in practice, we show that it actually exhibits an approximation ratio of $Θ(\sqrt{\ell+ρ})$. We give further hardness results for parameterizations such as the number of distinct rows or the maximum number of non-zero entries per row. Finally, we devise a dynamic programming algorithm that solves the problem for matrices of double-logarithmic width, or of logarithmic width under further restrictions. We also study all these findings from a new perspective by introducing a variant of the problem in which we wish to minimize the length of the resulting integer array by trimming the non-zero borders. This variant has not been studied in the literature before but has practical motivations.
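To make the representation and the greedy heuristic concrete, the following is a minimal Python sketch, not the paper's own formulation. It assumes that rows are processed in their given order, that shifts are non-negative, and that each row takes the smallest shift at which its non-zero entries land only on free array cells. The names `greedy_compress`, `access`, `owner`, and `shifts` are hypothetical.

```python
def greedy_compress(matrix):
    """First-fit sketch: shift each row right until its 1-entries hit only free cells.

    Returns (owner, shifts), where owner[p] is the index of the row whose
    non-zero entry occupies array position p (or None if the cell is free),
    and shifts[i] is the shift chosen for row i.
    """
    owner = []
    shifts = []
    for i, row in enumerate(matrix):
        cols = [j for j, bit in enumerate(row) if bit]
        s = 0
        # Increase the shift while any non-zero entry would collide
        # with a cell already claimed by a predecessor row.
        while any(s + j < len(owner) and owner[s + j] is not None for j in cols):
            s += 1
        # Grow the array so that every column of the shifted row is addressable.
        owner.extend([None] * (s + len(row) - len(owner)))
        for j in cols:
            owner[s + j] = i
        shifts.append(s)
    return owner, shifts


def access(owner, shifts, i, j):
    """Constant-time lookup: entry (i, j) is 1 iff row i owns cell shifts[i] + j."""
    return 1 if owner[shifts[i] + j] == i else 0


if __name__ == "__main__":
    example = [[1, 0, 1],
               [1, 1, 0],
               [0, 1, 0]]
    owner, shifts = greedy_compress(example)
    # Every entry of the matrix can be recovered from the compressed form.
    assert all(access(owner, shifts, i, j) == example[i][j]
               for i in range(3) for j in range(3))
```

In this sketch the length of `owner` plays the role of $\ell+ρ$, with $ρ$ being the largest chosen shift. The first-fit choice is what makes the heuristic greedy: a row never revisits the placement of its predecessors, which is why its result can be far from optimal.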