The transformer architecture, introduced by Vaswani et al. (2017), is at the heart of the remarkable recent progress in language models, including widely used chatbots such as ChatGPT and Claude. In this paper, I argue that we can extract a theory of the relationship between context and meaning from the way the transformer architecture works. I call this the transformer theory, and I argue that it is novel with respect to two related philosophical debates: the contextualism debate over the extent of context-sensitivity across natural language, and the polysemy debate over how polysemy should be captured within an account of word meaning.