Prompt compression is crucial for enhancing inference speed, reducing costs, and improving user experience. However, current methods face challenges such as low compression ratios and potential data leakage during evaluation. To address these issues, we propose 500xCompressor, a method that compresses extensive natural language contexts into as few as one special token. 500xCompressor introduces approximately 0.3% additional parameters and achieves compression ratios ranging from 6x to 480x. It is designed to compress any text and answer various types of questions, and it can be used by the original large language model (LLM) without requiring fine-tuning. 500xCompressor was pretrained on the Arxiv Corpus, fine-tuned on the ArxivQA dataset, and then evaluated on strictly unseen, classical question answering (QA) datasets. The results demonstrate that the LLM retained 62.26-72.89% of its capabilities compared with using non-compressed prompts. This study also shows that not all compressed tokens are equally utilized and that KV values have significant advantages over embeddings in preserving information at high compression ratios. The high compressibility of natural language prompts, even for fine-grained, complex information, suggests promising potential for future applications and further research into developing a new LLM language.
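To make the core idea concrete, the sketch below illustrates (under assumptions, not the authors' released implementation) how the key-value (KV) states of a handful of compressed-token positions, produced by an encoder pass over the long context, could be handed to the frozen original LLM in place of the full prompt. The backbone model name, the helper names `compress_to_kv` and `answer_with_compressed_kv`, the number of compressed slots, and the greedy decoding loop are all illustrative assumptions; the sketch also glosses over details such as positional alignment of the cached slots that a real implementation must handle.

```python
# Conceptual sketch: condition a frozen LLM on the KV states of a few
# compressed-token positions instead of the full long prompt.
# Model name, helper names, and slot count are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

MODEL_NAME = "Qwen/Qwen2.5-0.5B-Instruct"   # assumed small open backbone for illustration
NUM_COMPRESSED_SLOTS = 16                    # e.g. a ~500-token context kept as 16 KV slots

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
decoder = AutoModelForCausalLM.from_pretrained(MODEL_NAME)  # frozen original LLM
decoder.eval()


def compress_to_kv(encoder, context: str, num_slots: int):
    """Run the encoder over the long context and keep only the per-layer
    key/value states of the last `num_slots` positions (illustrative only;
    the actual method appends dedicated special tokens and trains LoRA
    adapters, roughly 0.3% extra parameters, to fill them)."""
    ids = tokenizer(context, return_tensors="pt").input_ids
    with torch.no_grad():
        out = encoder(input_ids=ids, use_cache=True)
    past = out.past_key_values
    if hasattr(past, "to_legacy_cache"):  # newer transformers returns a Cache object
        past = past.to_legacy_cache()
    # Slice each layer's (key, value) tensors down to the compressed slots.
    return tuple(
        (k[:, :, -num_slots:, :], v[:, :, -num_slots:, :]) for k, v in past
    )


def answer_with_compressed_kv(question: str, compressed_kv, max_new_tokens: int = 64):
    """Greedily decode an answer conditioned only on the compressed KV cache;
    the decoder never sees the original long context."""
    cache = DynamicCache.from_legacy_cache(compressed_kv)
    next_ids = tokenizer(question, return_tensors="pt").input_ids
    generated = []
    for _ in range(max_new_tokens):
        with torch.no_grad():
            out = decoder(input_ids=next_ids, past_key_values=cache, use_cache=True)
        cache = out.past_key_values
        next_tok = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy choice
        if next_tok.item() == tokenizer.eos_token_id:
            break
        generated.append(next_tok.item())
        next_ids = next_tok
    return tokenizer.decode(generated, skip_special_tokens=True)


# Hypothetical usage: `encoder` would be the same backbone plus trained LoRA adapters.
# kv = compress_to_kv(encoder, long_document_text, NUM_COMPRESSED_SLOTS)
# print(answer_with_compressed_kv("What does the document claim?", kv))
```

The design choice the sketch highlights is the one the abstract attributes to the method: storing the compressed information as per-layer KV states rather than as input embeddings, so the frozen LLM can attend to it directly during decoding.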