Large Language Models (LLMs) are regularly used to label data across many domains and for myriad tasks. By simply asking the LLM for an answer, or ``prompting,'' practitioners can quickly get a response for an arbitrary task. Prompting involves a series of decisions by the practitioner, from the simple wording of the prompt, to requesting the output in a certain data format, to jailbreaking when a prompt addresses a more sensitive topic. In this work, we ask: do variations in the way a prompt is constructed change the ultimate decision of the LLM? We answer this using a series of prompt variations across a variety of text classification tasks. We find that even the smallest of perturbations, such as adding a space at the end of a prompt, can cause the LLM to change its answer. Further, we find that requesting responses in XML and using common jailbreaks can have cataclysmic effects on the data labeled by LLMs.
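The comparison described above, labeling the same inputs under perturbed prompts and checking whether the label changes, can be illustrated with a minimal sketch. It is not the authors' code. The function names, the exact wording of the XML request, and the `toy_classify` stand-in for an LLM call are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def make_variations(prompt: str) -> Dict[str, str]:
    """Return the base prompt plus two perturbed versions (hypothetical wording)."""
    return {
        "original": prompt,
        "trailing_space": prompt + " ",  # the single-space perturbation
        "xml_format": prompt + "\nReturn your answer in XML.",  # assumed phrasing
    }


def changed_fraction(
    prompts: List[str], classify: Callable[[str], str]
) -> Dict[str, float]:
    """For each variation, compute the fraction of prompts whose label
    differs from the label given to the unmodified prompt."""
    if not prompts:
        raise ValueError("prompts must be non-empty")
    names = [n for n in make_variations("") if n != "original"]
    changed = {n: 0 for n in names}
    for prompt in prompts:
        variants = make_variations(prompt)
        base_label = classify(variants["original"])
        for name in names:
            if classify(variants[name]) != base_label:
                changed[name] += 1
    return {n: changed[n] / len(prompts) for n in names}


def toy_classify(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): a deterministic rule that is
    deliberately sensitive to a trailing space, so the sketch can run."""
    return "positive" if prompt.endswith(" ") else "negative"


if __name__ == "__main__":
    reviews = ["Label the sentiment: 'Great film.'", "Label the sentiment: 'Dull.'"]
    print(changed_fraction(reviews, toy_classify))
```

In practice `classify` would wrap a real model call and parse the label out of the response; the toy rule only demonstrates how the change rate is tallied per variation.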