Expand-and-sparsify representations are a class of theoretical models that capture sparse representation phenomena observed in the sensory systems of many animals. At a high level, these representations map an input $x \in \mathbb{R}^d$ into a much higher-dimensional space of dimension $m \gg d$ via random linear projections, then set the $k \ll m$ largest entries to one and all remaining entries to zero. The result is a $k$-sparse vector in $\{0,1\}^m$. We study the suitability of this representation for two fundamental statistical problems: density estimation and mode estimation. For density estimation, we show that a simple linear function of the expand-and-sparsify representation produces an estimator with minimax-optimal $\ell_{\infty}$ convergence rates. For mode estimation, we provide simple algorithms built on top of our density estimator that, under mild conditions, recover single or multiple modes at rates that are optimal up to logarithmic factors.
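To make the mapping concrete, the following is a minimal Python sketch of the expand-and-sparsify step, followed by a linear readout of the representation. Several choices here are assumptions for illustration and are not taken from the text: the projection directions are drawn as i.i.d. standard Gaussians, and the function names `make_projection`, `expand_and_sparsify`, `fit_weights`, and `linear_readout` are hypothetical. In particular, the weights computed by `fit_weights` are plain average activations, which give only an unnormalized score; they are not claimed to be the estimator analyzed in the paper.

```python
import random
from heapq import nlargest


def make_projection(d, m, seed=0):
    """Draw m random directions in R^d (assumption: i.i.d. standard Gaussian entries)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]


def expand_and_sparsify(x, theta, k):
    """Map x in R^d to a k-sparse vector in {0,1}^m.

    Coordinate j is 1 exactly when the projection <theta[j], x> is among
    the k largest of the m projections, and 0 otherwise.
    """
    y = [sum(w * xi for w, xi in zip(row, x)) for row in theta]
    top = nlargest(k, range(len(y)), key=y.__getitem__)
    z = [0] * len(y)
    for j in top:
        z[j] = 1
    return z


def fit_weights(sample, theta, k):
    """Illustrative weights: the fraction of sample points that activate each coordinate."""
    m = len(theta)
    counts = [0] * m
    for x in sample:
        for j, zj in enumerate(expand_and_sparsify(x, theta, k)):
            counts[j] += zj
    return [c / len(sample) for c in counts]


def linear_readout(z, w):
    """A linear function of the representation: the inner product <w, z>."""
    return sum(wj * zj for wj, zj in zip(w, z))


if __name__ == "__main__":
    d, m, k = 3, 200, 10
    theta = make_projection(d, m, seed=1)
    rng = random.Random(2)
    sample = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(100)]
    w = fit_weights(sample, theta, k)
    z = expand_and_sparsify([0.2, -0.1, 0.7], theta, k)
    print(sum(z), linear_readout(z, w))
```

Because every input activates exactly $k$ coordinates, the readout is a sum of $k$ weights, so evaluating it costs $O(k)$ once the active coordinates are known. Turning such a score into a density estimate would require a normalization that this sketch does not attempt.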