Semantic Compression With Large Language Models

April 25, 2023

Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-Smith, Jules White

The rise of large language models (LLMs) is revolutionizing information retrieval, question answering, summarization, and code generation tasks. However, in addition to confidently presenting factually inaccurate information at times (known as "hallucinations"), LLMs are also inherently limited by the number of input and output tokens that can be processed at once, making them potentially less effective on tasks that require processing a large set or continuous stream of information. A common approach to reducing the size of data is through lossless or lossy compression. Yet, in some cases it may not be strictly necessary to perfectly recover every detail from the original data, as long as a requisite level of semantic precision or intent is conveyed. This paper presents three contributions to research on LLMs. First, we present the results from experiments exploring the viability of approximate compression using LLMs, focusing specifically on GPT-3.5 and GPT-4 via ChatGPT interfaces. Second, we investigate and quantify the capability of LLMs to compress text and code, as well as to recall and manipulate compressed representations of prompts. Third, we present two novel metrics -- Exact Reconstructive Effectiveness (ERE) and Semantic Reconstruction Effectiveness (SRE) -- that quantify the level of preserved intent between text compressed and decompressed by the LLMs we studied. Our initial results indicate that GPT-4 can effectively compress and reconstruct text while preserving the semantic essence of the original text, providing a path to leverage $\sim$5$\times$ more tokens than present limits allow.
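To make the idea of measuring approximate compression concrete, the following is a minimal sketch of two simple quantities one might compute for a compress-then-reconstruct round trip: a compression ratio and an exact-match similarity. It is not the paper's definition of ERE or SRE, which the abstract does not spell out. The function names, the use of whitespace-separated words as a stand-in for model tokens, and the example strings (which stand in for LLM outputs) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def compression_ratio(original: str, compressed: str) -> float:
    """Original length divided by compressed length.

    Assumption: whitespace-separated words approximate tokens; a real
    measurement would use the model's own tokenizer.
    """
    orig_len = len(original.split())
    comp_len = len(compressed.split())
    if comp_len == 0:
        raise ValueError("compressed text is empty")
    return orig_len / comp_len


def exact_similarity(original: str, reconstructed: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings.

    Illustrative stand-in for an exact-reconstruction score, not the
    paper's ERE formula.
    """
    return SequenceMatcher(None, original, reconstructed).ratio()


# Hypothetical round trip: these strings are made up, not model output.
original = "The quick brown fox jumps over the lazy dog near the river bank"
compressed = "quick fox jumps lazy dog river"
reconstructed = "The quick fox jumps over the lazy dog by the river"

print(f"compression ratio: {compression_ratio(original, compressed):.2f}")
print(f"exact similarity:  {exact_similarity(original, reconstructed):.2f}")
```

A semantic score would compare meaning rather than characters, for example by embedding both texts and taking a cosine similarity. That step is omitted here because it depends on an embedding model the source does not name.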
