Prompting is a common approach for leveraging language models (LMs) in zero-shot settings. However, the underlying mechanisms that enable LMs to perform diverse tasks without task-specific supervision remain poorly understood. Studying the relationship between prompting and the quality of internal representations can shed light on how pre-trained embeddings support in-context task solving. In this empirical study, we conduct a series of probing experiments on prompt embeddings, analyzing various combinations of prompt templates for zero-shot classification. Our findings show that while prompting affects representation quality, these changes do not consistently correlate with the relevance of the prompts to the target task. This result challenges the assumption that more relevant prompts necessarily yield better representations. We further analyze potential factors that may contribute to this unexpected behavior.