Large language models (LLMs) have revolutionized the field of natural language processing and extended their strong capabilities into multi-modal domains. It is therefore vital to define proper and diverse metrics for evaluating LLMs. In this paper, we introduce matrix entropy, a novel metric rooted in principles of information theory and geometry that quantifies the data compression proficiency of LLMs. It reflects a model's ability to extract relevant information and discard unnecessary elements, thereby providing insight into the language model's intrinsic capability. We demonstrate its applicability in both single-modal (language) and multi-modal settings. For language models, we find that the matrix entropy of representations decreases in a scaling-law fashion as models scale up, complementing the traditional loss scaling law. For the multi-modal setting, we further propose an evaluation method based on matrix entropy for assessing alignment quality, and we find that modern large multi-modal models exhibit strong alignment performance.
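For intuition, the sketch below shows one common way to realize such a metric: the von Neumann entropy of a trace-normalized covariance (Gram) matrix built from model representations. This is a minimal illustration under that assumed definition; the function name `matrix_entropy`, the mean-centering step, and the division by log(d) are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def matrix_entropy(H: np.ndarray) -> float:
    """Von Neumann-style entropy of a batch of representations.

    H: (n, d) array holding n token/sentence embeddings of dimension d.
    Returns an entropy normalized to [0, 1] by dividing by log(d).
    """
    # Center the representations and build the (d, d) covariance-like matrix.
    Z = H - H.mean(axis=0, keepdims=True)
    K = Z.T @ Z / Z.shape[0]
    # Normalize to unit trace so the eigenvalues form a probability distribution.
    K /= np.trace(K)
    eigvals = np.linalg.eigvalsh(K)
    eigvals = eigvals[eigvals > 1e-12]  # drop numerical zeros before taking logs
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(entropy / np.log(K.shape[0]))

# Example with random Gaussian embeddings (assumed inputs for illustration).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reps = rng.standard_normal((512, 64))  # n=512 samples, d=64 dimensions
    print(matrix_entropy(reps))
```

Under this construction, a lower value indicates that the representations concentrate in fewer directions, which is one way to read "compression" of the input data.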