Compressive learning is an emerging approach to drastically reduce the memory footprint of large-scale learning, by first summarizing a large dataset into a low-dimensional sketch vector, and then decoding from this sketch the latent information needed for learning. In light of recent progress on information preservation guarantees for sketches based on random features, a major objective is to design easy-to-tune algorithms (called decoders) that extract this information robustly and efficiently. To address the underlying non-convex optimization problems, various heuristics have been proposed. In the case of compressive clustering, the standard heuristic is CL-OMPR, a variant of sliding Frank-Wolfe. Yet CL-OMPR is hard to tune, and its robustness has not been examined. In this work, we closely examine CL-OMPR in order to circumvent its limitations. In particular, we show how this algorithm can fail to recover the clusters even in favorable scenarios. To gain insight, we show how the deficiencies of this algorithm can be attributed to optimization difficulties related to the structure of a correlation function appearing at core steps of the algorithm. To address these limitations, we propose an alternative decoder offering substantial improvements over CL-OMPR. Its design is notably inspired by the mean shift algorithm, a classic approach to detecting the local maxima of kernel density estimators. The proposed algorithm can extract clustering information from a sketch of the MNIST dataset that is 10 times smaller than was previously possible.
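To make the sketching step concrete, the following is a minimal illustration, not the authors' implementation. It assumes the sketch is the empirical average of random Fourier features with Gaussian frequencies, which is one common choice of random features; the abstract does not specify it. The function name `compute_sketch`, the scale `sigma`, the sketch size `m`, and the synthetic two-cluster data are all hypothetical choices made for this example.

```python
import numpy as np


def compute_sketch(X, W):
    """Average the random Fourier features exp(i * W^T x) over the rows of X.

    X: (n, d) data matrix; W: (d, m) frequency matrix.
    Returns a complex vector of length m, whose size does not depend on n.
    """
    return np.exp(1j * (X @ W)).mean(axis=0)


rng = np.random.default_rng(0)
n, d, m = 10_000, 2, 50  # hypothetical sizes chosen for illustration

# Synthetic data: two well-separated Gaussian clusters (assumption).
centers = np.array([[-2.0, 0.0], [2.0, 0.0]])
labels = rng.integers(0, 2, size=n)
X = centers[labels] + 0.3 * rng.standard_normal((n, d))

# Gaussian random frequencies; sigma is a hypothetical scale parameter.
sigma = 1.0
W = rng.standard_normal((d, m)) / sigma

z = compute_sketch(X, W)

# The sketch is a mean, so sketches of disjoint chunks can be merged
# by a weighted average without revisiting the raw data.
n_a = 4_000
z_a = compute_sketch(X[:n_a], W)
z_b = compute_sketch(X[n_a:], W)
z_merged = (n_a * z_a + (n - n_a) * z_b) / n
```

Because the sketch is an average, it can be computed in a single pass over the data, or over separate chunks that are merged afterwards. The decoder then works only with the `m` complex numbers in `z`, never with the `n` original points.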