We study whether automatically induced prompts that effectively extract information from a language model can also be used, out-of-the-box, to probe other language models for the same information. After confirming that discrete prompts induced with the AutoPrompt algorithm outperform manual and semi-manual prompts on the slot-filling task, we demonstrate a drop in performance for AutoPrompt prompts learned on one model and tested on another. We introduce a way to induce prompts by mixing language models at training time, which yields prompts that generalize well across models. We conduct an extensive analysis of the induced prompts, finding that the more general prompts include a larger proportion of existing English words and have a less order-dependent and more uniform distribution of information across their component tokens. Our work provides preliminary evidence that it is possible to generate discrete prompts that can be induced once and used with a number of different models, and offers insights into the properties that characterize such prompts.