Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

How many key-value associations can a $d\times d$ linear memory store? We show that the answer depends not only on the $d^2$ degrees of freedom in the memory matrix, but also on the retrieval criterion. In an isotropic Gaussian model for the stored pairs, top-1 retrieval, where every signal must beat its largest distractor, requires the logarithmic model-size scale $d^2\asymp n\log n$. We prove that the correlation matrix memory construction, which stores associations by superposing key-target outer products, achieves this scale through a sharp phase transition, and that the same scaling is necessary for any linear memory. Thus the logarithm is the intrinsic extreme-value price of winner-take-all decoding. We next consider listwise retrieval, where the correct target need not be the unique top-scoring item but should remain among the strongest candidates. To formalize this regime, we propose the Tail-Average Margin (TAM), a convex upper-tail criterion that certifies inclusion of the correct target in a controlled candidate list. Under this listwise retrieval criterion, the capacity follows the quadratic scale $d^2\asymp n$. At load $n/d^2\to\alpha$, we develop an exact asymptotic theory for the TAM empirical-risk minimizer through a two-parameter scalar variational principle. The theory has a rich phenomenology: in the ridgeless limit it yields a closed-form critical load separating satisfiable and unsatisfiable phases, and it predicts the limiting laws of true scores, competitor scores, margins, and percentile profiles. Finally, a small-tail extrapolation leads to the conjectural sharp top-1 threshold $d^2\sim 2n\log n$.
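As an illustration of the correlation matrix memory and the top-1 criterion described above, the following is a minimal NumPy sketch, not the paper's code. It assumes keys and targets are drawn i.i.d. from $N(0, I_d/d)$ (one possible normalization of the isotropic Gaussian model) and that retrieval scores every stored target against the memory output; the function names `build_memory`, `top1_accuracy`, and `simulate` are hypothetical.

```python
import numpy as np

def build_memory(keys, targets):
    """Correlation matrix memory: W = sum_i targets[i] keys[i]^T, shape (d, d)."""
    return targets.T @ keys

def top1_accuracy(W, keys, targets):
    """Fraction of keys whose own target gets the strictly largest score."""
    # scores[i, j] = targets[j]^T W keys[i]
    scores = keys @ W.T @ targets.T
    winners = np.argmax(scores, axis=1)
    return float(np.mean(winners == np.arange(keys.shape[0])))

def simulate(n, d, seed=0):
    """Store n Gaussian pairs in a d x d memory and report top-1 accuracy.

    Assumption (not from the source): entries are N(0, 1/d), so that
    keys and targets have approximately unit norm.
    """
    rng = np.random.default_rng(seed)
    keys = rng.standard_normal((n, d)) / np.sqrt(d)
    targets = rng.standard_normal((n, d)) / np.sqrt(d)
    W = build_memory(keys, targets)
    return top1_accuracy(W, keys, targets)

if __name__ == "__main__":
    # Compare a light load (d^2 much larger than n log n) with a heavy one.
    print("light load:", simulate(n=32, d=128))
    print("heavy load:", simulate(n=1024, d=16))
```

In this sketch the light-load configuration has $d^2$ far above $n\log n$ and the heavy-load one far below it, so the two runs should fall on opposite sides of the top-1 scale discussed in the abstract; the exact accuracies depend on the seed and on the assumed normalization.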
