We introduce a novel positional encoding strategy for Transformer-style models, addressing the shortcomings of existing, often ad hoc, approaches. Our framework provides a flexible mapping from the algebraic specification of a domain to an interpretation as orthogonal operators. This design preserves the algebraic characteristics of the source domain, ensuring that the model upholds its desired structural properties. Our scheme can accommodate various structures, including sequences, grids, and trees, as well as their compositions. We conduct a series of experiments to demonstrate the practical applicability of our approach. Results suggest performance on par with or surpassing the current state of the art, without hyper-parameter optimization or "task search" of any kind. Code is available at https://github.com/konstantinosKokos/ape.
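The central idea, interpreting positions as orthogonal operators, can be sketched minimally for the sequence case: position \(n\) maps to the \(n\)-th power of an orthogonal matrix, and orthogonality guarantees that inner products between encoded queries and keys depend only on relative position, mirroring the group structure of the integers under addition. The sketch below is illustrative and uses a random (rather than learned) orthogonal matrix; all names are assumptions, not the released implementation.

```python
import numpy as np

def random_orthogonal(d: int, seed: int = 0) -> np.ndarray:
    # QR decomposition of a Gaussian matrix yields an orthogonal Q.
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

d = 4
W = random_orthogonal(d)

# Sequence position n is interpreted as the operator W^n.
# Since W is orthogonal (W.T == inv(W)), we have
#   (W^m q) . (W^n k) = q . (W^(n-m) k),
# i.e. attention scores depend only on the relative offset n - m.
q = np.ones(d)
k = np.arange(d, dtype=float)

lhs = (np.linalg.matrix_power(W, 2) @ q) @ (np.linalg.matrix_power(W, 5) @ k)
rhs = q @ (np.linalg.matrix_power(W, 3) @ k)
assert np.allclose(lhs, rhs)  # relative-position invariance holds
```

Other domains (grids, trees, and their compositions) would follow the same recipe: pick operators whose composition law matches the algebraic specification of the domain.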