A key component of generating text from modern language models (LMs) is the selection and tuning of decoding algorithms. These algorithms determine how text is generated from the internal probability distribution produced by the LM. Choosing a decoding algorithm and tuning its hyperparameters takes significant time, manual effort, and computation, and it also requires extensive human evaluation. The identity and hyperparameters of such decoding algorithms are therefore considered extremely valuable to their owners. In this work, we show, for the first time, that an adversary with typical API access to an LM can steal the type and hyperparameters of its decoding algorithms at very low monetary cost. Our attack is effective against popular LMs used in text generation APIs, including GPT-2 and GPT-3. We demonstrate that such information can be stolen for only a few dollars, e.g., $\$0.8$, $\$1$, $\$4$, and $\$40$ for the four versions of GPT-3.
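To make concrete what "decoding algorithm" and "hyperparameters" mean here, the following is a minimal sketch, not the attack described in this work. It assumes three commonly used decoding choices (temperature scaling, top-k sampling, and nucleus or top-p sampling), none of which the abstract names explicitly. The toy logits, the function names, and the value `hidden_k` are all hypothetical. The last few lines illustrate, under these assumptions, why sampled outputs alone can leak a hyperparameter: with top-k sampling, only k distinct tokens can ever appear at a given position.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical next-token logits over a six-token vocabulary.
logits = [2.0, 1.5, 1.0, 0.5, 0.0, -0.5]

# The "owner" side: a decoding setup with a hidden hyperparameter.
hidden_k = 3
probs = top_k_filter(softmax(logits, temperature=1.0), hidden_k)

# The "observer" side: only sampled tokens are visible, not probs or hidden_k.
rng = random.Random(0)
observed = {sample(probs, rng) for _ in range(2000)}

# Counting distinct observed tokens gives a simple estimate of k.
estimated_k = len(observed)
print("distinct tokens observed:", sorted(observed), "estimated k:", estimated_k)
```

In this toy setting the estimate relies on drawing enough samples for every surviving token to appear at least once; a real API adds complications (large vocabularies, combined top-k and top-p filtering, unknown temperature) that this sketch does not attempt to handle.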