Modern AI models are growing rapidly in both size and redundancy, creating significant storage and distribution challenges for model hubs. We present TensorHub, a tensor-centric system that reduces storage overhead through fine-grained deduplication and compression. TensorHub uses tensor-level fingerprinting and clustering to identify redundancy across models without requiring annotations. Our design reduces storage efficiently while preserving model usability and performance. Experiments on real-world model repositories demonstrate substantial storage savings with minimal overhead.
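To make the idea of tensor-level fingerprinting concrete, the following is a minimal sketch of exact-match deduplication across two models. It is an illustration under stated assumptions, not TensorHub's implementation: the function names (`fingerprint`, `deduplicate`, `pack`), the choice of SHA-256, and the toy tensors are all hypothetical, and the sketch covers neither the clustering nor the compression steps mentioned above.

```python
import hashlib
import struct


def fingerprint(dtype: str, shape: tuple, data: bytes) -> str:
    """Content hash over dtype, shape, and raw bytes of a single tensor.

    SHA-256 is an assumption made for this sketch.
    """
    h = hashlib.sha256()
    h.update(dtype.encode())
    h.update(repr(tuple(shape)).encode())
    h.update(data)
    return h.hexdigest()


def deduplicate(models):
    """Store each distinct tensor once.

    models: {model_name: {tensor_name: (dtype, shape, data)}}
    Returns (store, manifests), where store maps fingerprint -> tensor and
    manifests maps model_name -> {tensor_name: fingerprint}.
    """
    store = {}
    manifests = {}
    for model_name, tensors in models.items():
        manifest = {}
        for tensor_name, (dtype, shape, data) in tensors.items():
            fp = fingerprint(dtype, shape, data)
            # Keep only the first copy of any byte-identical tensor.
            store.setdefault(fp, (dtype, shape, data))
            manifest[tensor_name] = fp
        manifests[model_name] = manifest
    return store, manifests


def pack(values):
    """Serialize a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


# Toy example: a fine-tuned model that shares its embedding with the base.
base = {
    "embed": ("float32", (2, 2), pack([1.0, 2.0, 3.0, 4.0])),
    "head": ("float32", (2,), pack([0.5, 0.5])),
}
finetuned = {
    "embed": base["embed"],  # unchanged from the base model
    "head": ("float32", (2,), pack([0.25, 0.75])),
}

store, manifests = deduplicate({"base": base, "finetuned": finetuned})
raw_bytes = sum(len(t[2]) for m in (base, finetuned) for t in m.values())
stored_bytes = sum(len(t[2]) for t in store.values())
print(f"raw: {raw_bytes} bytes, stored: {stored_bytes} bytes")
```

In this toy case the shared embedding is stored once, so four tensors collapse to three stored entries. Each model can still be rebuilt from its manifest by looking up every fingerprint in the store, which is one simple way to keep deduplicated models usable.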