Large Language Models (LLMs) like BERT have gained significant prominence due to their remarkable performance in various natural language processing tasks. However, they come with substantial computational and memory costs. Additionally, they are essentially black-box models that are challenging to explain and interpret. In this article, we propose Optimus BERT Compression and Explainability (OBCE), a methodology that brings explainability to BERT models using persistent homology, measuring the importance of each neuron by studying the topological characteristics of its outputs. As a result, we can compress BERT significantly by reducing the number of parameters (58.47% of the original parameters for BERT Base, 52.3% for BERT Large). We evaluated our methodology on the standard GLUE Benchmark, comparing the results with state-of-the-art techniques and achieving outstanding results. Consequently, our methodology can "whiten" BERT models by providing explainability for their neurons and reducing model size, making them more suitable for deployment on resource-constrained devices.
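To make the idea of scoring neurons by the topology of their outputs concrete, the following is a minimal sketch and not the OBCE procedure itself. It assumes each neuron's outputs over a set of inputs are treated as a one-dimensional point cloud, computes the finite zero-dimensional persistence lifetimes of that cloud, and ranks neurons by total lifetime. The function names, the choice of total persistence as the score, and the keep fraction are all hypothetical and are not taken from the article.

```python
from typing import List, Sequence


def zero_dim_persistence(values: Sequence[float]) -> List[float]:
    """Finite 0-dimensional persistence lifetimes of a 1-D point cloud.

    For points on a line, connected components of the Vietoris-Rips
    filtration merge exactly at the gaps between sorted neighbours, so the
    finite lifetimes are those gaps (every component is born at scale 0).
    """
    ordered = sorted(values)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def neuron_scores(activations: Sequence[Sequence[float]]) -> List[float]:
    """Score each neuron by the total persistence of its outputs.

    activations[i][j] is the output of neuron j on input i.
    Assumption: total persistence is an illustrative score only. In this
    1-D setting it equals the range (max - min) of the neuron's outputs.
    """
    n_neurons = len(activations[0])
    scores = []
    for j in range(n_neurons):
        column = [row[j] for row in activations]
        scores.append(sum(zero_dim_persistence(column)))
    return scores


def keep_top_fraction(scores: Sequence[float], fraction: float) -> List[int]:
    """Return the sorted indices of the highest-scoring neurons.

    Assumption: `fraction` is a hypothetical pruning budget, not a value
    reported in the article.
    """
    n_keep = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


# Toy example: 3 inputs, 3 neurons. Neuron 0 is constant, so it scores 0.
toy_activations = [
    [0.0, 1.0, 5.0],
    [0.0, 3.0, 5.1],
    [0.0, 2.0, 4.9],
]
toy_scores = neuron_scores(toy_activations)
kept = keep_top_fraction(toy_scores, 2 / 3)
```

In this toy run the constant neuron receives a score of zero and is the one dropped. A real pipeline would take activations from a BERT layer and would likely use a richer topological summary than the one assumed here.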