Motivated by the need for communication-efficient distributed learning, we investigate how to compress a unit-norm vector into the minimum number of bits while still allowing recovery within an acceptable level of distortion. This problem has been studied in the rate-distortion and covering-code literature, but our focus is exclusively on the "high-distortion" regime. We take a worst-case approach, assuming no prior information about the vector but allowing randomized compression maps. We consider both biased and unbiased compression methods and determine the optimal compression rates. It turns out that simple compression schemes are nearly optimal in this regime. The results are a mix of new and known; we compile them here for completeness.