Large language models (LLMs) can serve as accessible, intelligent chatbots: a user constructs a natural language query and feeds the prompt directly into the model. However, different prompt constructions often lead to uncertain answers, which makes it hard to draw on the specific knowledge of LLMs (such as ChatGPT). To alleviate this, we use an interpretable structure to explain the principle behind prompt learning in LLMs, and show that the effectiveness of a language model is determined by changes in the positions of task-related tokens. Based on this, we propose MTPrompt, a multi-dimensional task prompt learning method built on task-related object, summary, and task description information. By automatically building and searching for appropriate prompts, MTPrompt achieves the best results in the few-shot setting across five different datasets. In addition, we demonstrate the effectiveness and stability of our method under different experimental settings and in ablation experiments. When interacting with large language models, embedding more task-related information into prompts makes it easier to elicit the knowledge embedded in the models.