Language Models are Injective and Hence Invertible

Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the input from a model's representations. In this paper, we challenge this view. First, we prove mathematically that transformer language models mapping discrete input sequences to their corresponding sequence of continuous representations are injective and therefore lossless, a property established at initialization and preserved during training. Second, we confirm this result empirically through billions of collision tests on six state-of-the-art language models, and observe no collisions. Third, we operationalize injectivity: we introduce SipIt, the first algorithm that provably and efficiently reconstructs the exact input text from hidden activations, establishing linear-time guarantees and demonstrating exact invertibility in practice. Overall, our work establishes injectivity as a fundamental and exploitable property of language models, with direct implications for transparency, interpretability, and safe deployment.
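The two empirical ideas in the abstract, collision testing and token-by-token recovery from hidden states, can be illustrated on a toy causal map. The following is a minimal sketch under stated assumptions, not the paper's experimental setup and not the SipIt algorithm itself: the "model" is a hypothetical random tanh recurrence, and `VOCAB_SIZE`, `DIM`, `MAX_LEN`, `step`, `hidden_states`, `min_collision_distance`, and `invert` are all names invented for this illustration. The inversion here is a brute-force search over the vocabulary at each position, which only shows why causality makes sequential recovery possible when the map is injective.

```python
import itertools
import math
import random

# All sizes below are hypothetical toy values chosen for illustration.
VOCAB_SIZE = 6
DIM = 8
MAX_LEN = 3

rng = random.Random(0)
# Random embeddings and a random mixing matrix stand in for a real language model.
EMBED = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB_SIZE)]
MIX = [[rng.gauss(0, 1) / math.sqrt(DIM) for _ in range(DIM)] for _ in range(DIM)]


def step(state, token):
    """Causal update: the new state depends only on the previous state and the current token."""
    return [
        math.tanh(sum(MIX[i][j] * state[j] for j in range(DIM)) + EMBED[token][i])
        for i in range(DIM)
    ]


def hidden_states(tokens):
    """Return one hidden vector per input position."""
    state = [0.0] * DIM
    states = []
    for tok in tokens:
        state = step(state, tok)
        states.append(state)
    return states


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_collision_distance():
    """Smallest distance between last-position states of any two distinct sequences."""
    reps = []
    for n in range(1, MAX_LEN + 1):
        for seq in itertools.product(range(VOCAB_SIZE), repeat=n):
            reps.append(hidden_states(seq)[-1])
    return min(distance(a, b) for a, b in itertools.combinations(reps, 2))


def invert(states):
    """Recover tokens one position at a time by matching each observed state.

    Brute-force illustration only: every vocabulary entry is tried at every position.
    """
    prev = [0.0] * DIM
    tokens = []
    for observed in states:
        best = min(range(VOCAB_SIZE), key=lambda t: distance(step(prev, t), observed))
        tokens.append(best)
        prev = observed
    return tokens


if __name__ == "__main__":
    print("no collision found:", min_collision_distance() > 0.0)
    print("recovered tokens:", invert(hidden_states([3, 1, 4, 1, 5])))
```

In this toy, a strictly positive minimum distance means no two distinct sequences share a representation, and the inversion loop does a fixed amount of work per position, so its cost grows linearly with sequence length. Neither observation says anything about real transformer models beyond what the abstract itself claims.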
