We introduce a new construction of embeddings of arbitrary recursive data structures into high-dimensional vectors. These embeddings provide an interpretable model for the latent state vectors of transformers. We demonstrate that these embeddings can be decoded to the original data structure when the embedding dimension is sufficiently large. This decoding algorithm has a natural implementation as a transformer. We also show that these embedding vectors can be manipulated directly to perform computations on the underlying data without decoding. As an example, we present an algorithm that constructs the embedded parse tree of an embedded token sequence using only vector operations in embedding space.
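To make the idea of a decodable embedding of a recursive structure concrete, the following is a minimal sketch and not the construction introduced in this work. It swaps in a generic technique: each token is assigned a random unit vector, and each child slot of a binary node is assigned a random orthogonal matrix. The names `DIM`, `TOKENS`, `CODEBOOK`, `LEFT`, `RIGHT`, `embed`, and `decode`, the vocabulary, and the 0.5 leaf threshold are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 1024  # embedding dimension (assumed; larger values reduce crosstalk)


def random_orthogonal(dim):
    # The Q factor of a QR decomposition of a Gaussian matrix is orthogonal.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


# Hypothetical vocabulary: each token gets a random unit vector.
TOKENS = ["a", "b", "c", "d"]
CODEBOOK = rng.standard_normal((len(TOKENS), DIM))
CODEBOOK /= np.linalg.norm(CODEBOOK, axis=1, keepdims=True)

# One orthogonal "role" matrix per child slot of a binary node (assumption).
LEFT, RIGHT = random_orthogonal(DIM), random_orthogonal(DIM)


def embed(tree):
    """Embed a token (str) or a pair (left, right) of subtrees as one vector."""
    if isinstance(tree, str):
        return CODEBOOK[TOKENS.index(tree)]
    left, right = tree
    return LEFT @ embed(left) + RIGHT @ embed(right)


def decode(vec, max_depth=8):
    """Recover the tree; a strong codebook match is treated as a leaf."""
    scores = CODEBOOK @ vec
    best = int(np.argmax(scores))
    if scores[best] > 0.5 or max_depth == 0:
        return TOKENS[best]
    # An orthogonal matrix is inverted by its transpose; the sibling subtree
    # is left behind as low-amplitude noise when DIM is large.
    return (decode(LEFT.T @ vec, max_depth - 1),
            decode(RIGHT.T @ vec, max_depth - 1))


tree = (("a", "b"), ("c", ("d", "a")))
recovered = decode(embed(tree))
print(recovered == tree)
```

In this sketch, extracting a child (`LEFT.T @ vec`) is itself a vector operation in embedding space, and the residual crosstalk shrinks roughly as the square root of the number of leaves divided by `DIM`. That scaling is one way to read the requirement that the embedding dimension be sufficiently large, under the assumptions stated above.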