In transformer architectures, position encoding gives the model information about the order of the input tokens. While the method of the original transformer paper has shown satisfactory results in general language processing tasks, newer proposals such as Rotary Position Embedding (RoPE) aim to improve on it. This paper presents geotokens, input components for transformers that are each linked to a specific geographical location. Unlike typical language sequences, the order of these tokens matters less than their geographical coordinates. To represent relative position in this setting and to maintain a balance between real-world distance and distance in the embedding space, we design a position encoding approach that draws on the RoPE structure but is tailored to spherical coordinates.
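
To make the idea of a RoPE-style encoding on spherical coordinates concrete, the following is a minimal sketch and not necessarily the construction used in this paper. It assumes that the embedding is split into 3-dimensional blocks and that each block is rotated by a 3x3 rotation built from the token's latitude and longitude. The function names `rotation_3d` and `spherical_rotary_encode`, the choice of rotation axes, and the use of a single rotation for every block are all illustrative assumptions.

```python
import math


def rotation_3d(lat_rad, lon_rad):
    """Return Rz(lon) @ Ry(lat) as a 3x3 nested list (assumed axis convention)."""
    cl, sl = math.cos(lat_rad), math.sin(lat_rad)
    co, so = math.cos(lon_rad), math.sin(lon_rad)
    return [
        [co * cl, -so, co * sl],
        [so * cl, co, so * sl],
        [-sl, 0.0, cl],
    ]


def spherical_rotary_encode(vec, lat_deg, lon_deg):
    """Rotate each consecutive 3-dim block of vec by the location's rotation."""
    if len(vec) % 3 != 0:
        raise ValueError("embedding dimension must be a multiple of 3")
    rot = rotation_3d(math.radians(lat_deg), math.radians(lon_deg))
    out = []
    for i in range(0, len(vec), 3):
        block = vec[i:i + 3]
        # Matrix-vector product of the 3x3 rotation with this block.
        out.extend(sum(rot[r][c] * block[c] for c in range(3)) for r in range(3))
    return out


def dot(a, b):
    """Plain inner product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical query and key vectors with two 3-dim blocks each.
q = [1.0, 0.0, 0.0, 0.5, -0.5, 2.0]
k = [0.2, 0.3, -1.0, 1.0, 1.0, 0.0]

q_enc = spherical_rotary_encode(q, lat_deg=20.0, lon_deg=10.0)
k_enc = spherical_rotary_encode(k, lat_deg=-35.0, lon_deg=50.0)
score = dot(q_enc, k_enc)
```

Because each block is multiplied by an orthogonal matrix, vector norms are preserved, and the score between two encoded vectors depends on their locations only through the relative rotation between them, which mirrors the relative-position property of RoPE. In this particular sketch, shifting both longitudes by the same angle leaves the score unchanged.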