Sequence learning reduces to similarity-based retrieval over a temporally indexed representation space. This is a constraint on any sequence model, not a property of a specific architecture. We show that a spiking Sparse Distributed Memory sequence machine (2007) and the transformer (2017) independently instantiate the same five functional operations (encoding, context maintenance, associative retrieval, storage, and decoding), with cosine similarity as the shared retrieval primitive. We formalise a Phase-Latency Isomorphism showing that sinusoidal positional phase and spike timing are linearly related, and prove that dot-product attention is invariant to this mapping up to a global scale factor on the positional component (Lemma 1). Empirically, frequency-compressed positional encoding fails to converge on a positionally demanding copy task, while a learned rank-based embedding matches or exceeds sinusoidal encoding. This indicates that the critical property of a positional representation is distance discriminability under dot-product similarity, not sinusoidal form. Time, phase, and rank are three instantiations of the same computational primitive: an ordered index whose structure survives similarity-based retrieval.
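The notion of distance discriminability under dot-product similarity can be illustrated with a minimal sketch. It assumes the standard transformer sinusoidal encoding with base 10000 and an arbitrary dimension of 16. The function names (`sinusoidal_encoding`, `offset_similarity`) are hypothetical and do not come from the paper, and the sketch does not reproduce the paper's experiments or Lemma 1. It shows only the well-known identity that the dot product of two sinusoidal encodings depends on the offset between positions, not on the absolute positions.

```python
import math


def frequencies(d_model=16, base=10000.0):
    """Angular frequencies of the standard sinusoidal encoding (assumed form)."""
    return [base ** (-2.0 * i / d_model) for i in range(d_model // 2)]


def sinusoidal_encoding(pos, d_model=16, base=10000.0):
    """Encode one position as interleaved (sin, cos) pairs, one pair per frequency."""
    vec = []
    for omega in frequencies(d_model, base):
        vec.append(math.sin(omega * pos))
        vec.append(math.cos(omega * pos))
    return vec


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def offset_similarity(offset, d_model=16, base=10000.0):
    """Closed form of PE(p) . PE(p + offset): the sum of cos(omega_i * offset).

    Follows from sin(a)sin(b) + cos(a)cos(b) = cos(a - b) applied per frequency.
    """
    return sum(math.cos(omega * offset) for omega in frequencies(d_model, base))


if __name__ == "__main__":
    # Same offset (4) at different absolute positions gives the same similarity.
    for p, q in [(3, 7), (40, 44), (100, 104)]:
        sim = dot(sinusoidal_encoding(p), sinusoidal_encoding(q))
        print(f"p={p:3d} q={q:3d} dot={sim:.6f}")

    # Similarity as a function of offset; offset 0 gives the maximum, d_model / 2.
    for k in range(6):
        print(f"offset={k} similarity={offset_similarity(k):.4f}")
```

In this sketch the similarity at offset 0 equals `d_model / 2`, and every nonzero integer offset scores strictly lower, so retrieval by dot product can distinguish a position from its neighbours. That is the property the abstract identifies as critical, independent of whether the index is a phase, a spike time, or a learned rank.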