Large Language Models (LLMs) have demonstrated exceptional capabilities across diverse tasks. However, their deployment in long-context scenarios remains hindered by computational inefficiency and information redundancy. Context compression methods address these challenges by significantly reducing input length and eliminating redundancy. We propose COMI, a coarse-to-fine adaptive context compression framework that jointly optimizes for semantic relevance and diversity under high compression rates. We introduce Marginal Information Gain (MIG), a metric defined as the relevance of a unit to the input query minus its semantic redundancy with other units, which guides the compression process to prioritize information that is both relevant and minimally redundant. The framework operates in two stages: (1) Coarse-Grained Group Reallocation, where the context is partitioned into groups that are dynamically assigned compression rates based on inter-group MIG, ensuring compression budgets align with the distribution of information value; and (2) Fine-Grained Token Merging, where tokens within each group are fused via an intra-group MIG-based weighting mechanism, preserving key semantics while avoiding the accumulation of redundancy. Extensive experiments on question answering (e.g., NaturalQuestions, 2WikiMQA, HotpotQA, and NarrativeQA) and summarization (e.g., MultiNews) tasks with various backbones (e.g., LLaMA-2-7B, Qwen2-7B) show that COMI outperforms existing baselines by a large margin, e.g., an approximately 25-point Exact Match (EM) improvement under a 32x compression constraint with Qwen2-7B on NaturalQuestions.
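The MIG definition above (query relevance minus semantic redundancy with other units) can be sketched concretely. The following is a minimal illustration only, not the paper's implementation: it assumes units and the query are represented as dense embedding vectors, uses cosine similarity for both terms, and introduces a hypothetical trade-off weight `lam` that the abstract does not specify.

```python
import numpy as np

def marginal_information_gain(unit_embs: np.ndarray, query_emb: np.ndarray,
                              lam: float = 1.0) -> np.ndarray:
    """Sketch of MIG: relevance(unit, query) - lam * redundancy(unit, other units).

    unit_embs: (n_units, dim) embedding matrix; query_emb: (dim,) vector.
    Cosine similarity and the `lam` weight are illustrative assumptions.
    """
    # Normalize rows and the query so dot products become cosine similarities.
    U = unit_embs / np.linalg.norm(unit_embs, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)

    relevance = U @ q                      # cosine similarity of each unit to the query
    sim = U @ U.T                          # pairwise unit-unit similarities
    np.fill_diagonal(sim, 0.0)             # exclude self-similarity
    redundancy = sim.sum(axis=1) / (len(U) - 1)  # mean similarity to the other units

    return relevance - lam * redundancy
```

Under this sketch, a unit that closely matches the query but overlaps little with its neighbors receives a high MIG, so a compression budget allocated by inter-group MIG concentrates tokens where information is both relevant and non-redundant.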