The rise of large language models (LLMs) is revolutionizing information retrieval, question answering, summarization, and code generation tasks. However, in addition to occasionally presenting factually inaccurate information with confidence (known as "hallucinations"), LLMs are inherently limited by the number of input and output tokens they can process at once, which makes them potentially less effective on tasks that require processing a large set or continuous stream of information. A common approach to reducing data size is lossless or lossy compression. In some cases, however, it may not be necessary to recover every detail of the original data perfectly, as long as a requisite level of semantic precision or intent is conveyed. This paper presents three contributions to research on LLMs. First, we present results from experiments exploring the viability of approximate compression using LLMs, focusing specifically on GPT-3.5 and GPT-4 via ChatGPT interfaces. Second, we investigate and quantify the capability of LLMs to compress text and code, as well as to recall and manipulate compressed representations of prompts. Third, we present two novel metrics -- Exact Reconstructive Effectiveness (ERE) and Semantic Reconstruction Effectiveness (SRE) -- that quantify the level of preserved intent between text compressed and decompressed by the LLMs we studied. Our initial results indicate that GPT-4 can effectively compress and reconstruct text while preserving the semantic essence of the original text, providing a path to leverage $\sim$5$\times$ more tokens than present limits allow.
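As a rough illustration of what measuring approximate compression involves, the sketch below computes two quantities for a compress-then-reconstruct round trip: how much shorter the compressed text is, and how closely the reconstruction matches the original. The abstract does not give the formulas for ERE or SRE, so this is not the paper's definition of either metric; it is a stand-in under stated assumptions. The function names `compression_ratio` and `exact_similarity` are hypothetical, character counts are used as a proxy for token counts, and the sample strings are invented rather than real model output.

```python
from difflib import SequenceMatcher


def compression_ratio(original: str, compressed: str) -> float:
    """Original length divided by compressed length, in characters.

    Assumption: characters stand in for tokens; a real measurement
    would count tokens with the model's own tokenizer.
    """
    if not compressed:
        raise ValueError("compressed text must be non-empty")
    return len(original) / len(compressed)


def exact_similarity(original: str, reconstructed: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings.

    Assumption: difflib's matching-block ratio is used here as a simple
    proxy for exact reconstruction quality, not the paper's ERE formula.
    """
    return SequenceMatcher(None, original, reconstructed).ratio()


# Invented example strings, not output from GPT-3.5 or GPT-4.
original = "The quick brown fox jumps over the lazy dog."
compressed = "qck brwn fx jmps ovr lzy dg"
reconstructed = "The quick brown fox jumps over a lazy dog."

print("compression ratio:", compression_ratio(original, compressed))
print("exact similarity:", exact_similarity(original, reconstructed))
```

A semantic counterpart would compare meaning rather than characters, for example by comparing embeddings of the original and reconstructed text; that step is omitted here because it depends on an embedding model the abstract does not name.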