Large Language Models (LLMs) have shown exceptional abilities across a wide range of natural language processing tasks. While prompting is a crucial tool for LLM inference, we observe that exceedingly lengthy prompts incur a significant cost. Existing attempts to compress lengthy prompts lead to substandard results in terms of readability and interpretability of the compressed prompt, with a detrimental impact on prompt utility. To address this, we propose Prompt-SAW: Prompt compresSion via Relation AWare graphs, an effective strategy for prompt compression over task-agnostic and task-aware prompts. Prompt-SAW uses the prompt's textual information to build a graph and then extracts key information elements from the graph to construct the compressed prompt. We also propose GSM8K-aug, an extended version of the existing GSM8K benchmark for task-agnostic prompts, in order to provide a comprehensive evaluation platform. Experimental evaluation on benchmark datasets shows that prompts compressed by Prompt-SAW are not only more readable, but also outperform the best-performing baselines by up to 10.1% and 77.1% in the task-agnostic and task-aware settings, respectively, while compressing the original prompt text by 34.9% and 56.7%.
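To make the build-a-graph-then-extract idea concrete, the following is a minimal illustrative sketch, not the authors' implementation: it assumes (subject, relation, object) triples have already been extracted from the prompt, stores them in a graph, and uses a naive token-overlap score against the query as a stand-in for the paper's relation-aware selection of key information elements. All function names and the example triples are hypothetical.

```python
# Illustrative sketch (assumptions labeled): graph-based prompt compression.
import networkx as nx


def build_graph(triples):
    """Assumed input: (subject, relation, object) triples already extracted
    from the prompt text; each triple becomes a labeled edge."""
    g = nx.DiGraph()
    for subj, rel, obj in triples:
        g.add_edge(subj, obj, relation=rel)
    return g


def compress(graph, query, keep=2):
    """Score each triple by token overlap with the query (a simple stand-in
    for relation-aware scoring) and keep the top-k as the compressed prompt."""
    q_tokens = set(query.lower().split())
    scored = []
    for subj, obj, data in graph.edges(data=True):
        text = f"{subj} {data['relation']} {obj}"
        score = len(q_tokens & set(text.lower().split()))
        scored.append((score, text))
    scored.sort(reverse=True)
    return ". ".join(text for _, text in scored[:keep])


# Hypothetical GSM8K-style example.
triples = [
    ("Natalia", "sold clips to", "48 friends in April"),
    ("Natalia", "sold", "half as many clips in May"),
    ("Natalia", "likes", "drawing"),
]
query = "How many clips did Natalia sell in April and May?"
print(compress(build_graph(triples), query))
```

In this toy setting, the irrelevant "likes drawing" element is dropped while the two elements needed to answer the query are retained, which is the intuition behind compressing prompts at the level of graph elements rather than raw tokens.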