This paper presents a direct framework for sequence models with hidden states on closed subgroups of U(d). We use a minimal axiomatic setup and derive recurrent and transformer templates from a shared skeleton in which subgroup choice acts as a drop-in replacement for state space, tangent projection, and update map. We then specialize to O(d) and evaluate orthogonal-state RNN and transformer models on Tiny Shakespeare and Penn Treebank under parameter-matched settings. We also report a general linear-mixing extension in tangent space, which applies across subgroup choices and improves finite-budget performance in the current O(d) experiments.
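To make the shared skeleton concrete for the O(d) case, the sketch below shows one possible instance of the three subgroup-specific pieces named above: the state space (orthogonal matrices), the tangent projection (onto skew-symmetric matrices), and the update map. It is only an illustration under stated assumptions, not the paper's implementation. The Cayley transform is swapped in here as the update map because it needs nothing beyond a linear solve; the function names `skew_projection`, `cayley`, `orthogonal_step`, and `run_demo` are hypothetical, and the random matrices stand in for whatever learned pre-activation a real model would produce.

```python
import numpy as np


def skew_projection(M):
    """Project a square matrix onto the skew-symmetric matrices,
    i.e. the tangent space of O(d) at the identity."""
    return 0.5 * (M - M.T)


def cayley(A):
    """Cayley transform (I - A/2)^{-1} (I + A/2).

    For skew-symmetric A the result is orthogonal. This is an assumed
    choice of update map for illustration; the matrix exponential is
    another common option.
    """
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - 0.5 * A, I + 0.5 * A)


def orthogonal_step(H, M):
    """One hidden-state update: project M to the tangent space, map it
    to the group, and right-multiply the current state."""
    return H @ cayley(skew_projection(M))


def run_demo(d=4, steps=50, seed=0):
    """Apply repeated updates driven by random matrices (placeholders
    for learned, input-dependent pre-activations)."""
    rng = np.random.default_rng(seed)
    H = np.eye(d)  # start at the identity, which lies in O(d)
    for _ in range(steps):
        M = rng.normal(size=(d, d))
        H = orthogonal_step(H, M)
    return H


if __name__ == "__main__":
    H = run_demo()
    # Deviation from orthogonality; expected to be near machine precision.
    print(np.linalg.norm(H.T @ H - np.eye(H.shape[0])))
```

Swapping in a different closed subgroup of U(d) would, in this sketch, mean replacing `skew_projection` with the projection onto that subgroup's Lie algebra and keeping the rest of the step unchanged.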