Nobody knows how language works, but theories abound. Transformers are a class of neural networks that process language automatically, with more success than alternatives, whether those alternatives are based on other neural computations or on other (e.g., more symbolic) mechanisms. Here, I highlight direct connections between the transformer architecture and certain theoretical perspectives on language. The empirical success of transformers relative to alternative models provides circumstantial evidence that the linguistic approaches transformers embody should, at the least, be evaluated with greater scrutiny by the linguistics community and, at best, be considered the best theories currently available.