Motivated by the need for communication-efficient distributed learning, we investigate how to compress a unit-norm vector into the minimum number of bits while still allowing an acceptable level of distortion in recovery. This problem has been explored in the rate-distortion and covering-code literature, but our focus is exclusively on the "high-distortion" regime. We take a worst-case approach: no prior information on the vector is assumed, but randomized compression maps are allowed. We consider both biased and unbiased compression methods and determine the optimal compression rates. It turns out that simple compression schemes are nearly optimal in this setting. The results are a mix of new and known, and they are compiled in this paper for completeness.