Lossless compression is essential for efficient data storage and transmission. Although learning-based lossless compressors achieve strong results, most are designed for a single modality, requiring redundant compressor deployments in multi-modal settings. Designing a unified multi-modal compressor is critical yet challenging, as different data types differ widely in format, dimensionality, and statistics. Multi-modal large language models offer a promising solution but remain too complex for practical use. We therefore propose \textbf{OmniZip}, \textbf{a unified and lightweight lossless compressor for multi-modal data (e.g., image, text, speech, tactile, database, and gene sequence)}. Built on a lightweight backbone, OmniZip incorporates three key components for efficient multi-modal lossless compression: a modality-unified tokenizer that reversibly transforms diverse data into tokens, a modality-routing context-learning mechanism that enables flexible multi-modal context modeling, and a modality-routing feedforward design that further enhances the model's nonlinear representation flexibility. A reparameterization training strategy is used to enhance model capacity. OmniZip outperforms or matches other state-of-the-art compressors across multiple modalities, achieving 42\%, 57\%, 62\%, 42\%, and 53\% higher compression efficiency than gzip on the CLIC-M, TouchandGo, enwik9, LibriSpeech, and WikiSQL datasets, respectively. It also supports near-real-time inference on resource-constrained edge devices, reaching about 1\,MB/s on MacBook CPUs and iPhone NPUs. Our code is released at https://github.com/adminasmi/OmniZip-CVPR2026.