Large Language Models (LLMs) have revolutionized natural language processing (NLP) by delivering state-of-the-art performance across a wide range of tasks. Transformer-based models such as BERT and GPT rely on pooling layers to aggregate token-level embeddings into sentence-level representations, and common pooling mechanisms such as Mean, Max, and Weighted Sum play a pivotal role in this aggregation. Despite their widespread use, the comparative performance of these strategies across different LLM architectures remains underexplored. To address this gap, this paper investigates the effects of these pooling mechanisms on two prominent LLM families, BERT and GPT, in the context of sentence-level sentiment analysis. Comprehensive experiments reveal that each pooling mechanism exhibits distinct strengths and weaknesses depending on the task's specific requirements. Our findings underscore the importance of selecting pooling methods tailored to the demands of particular applications and prompt a re-evaluation of common assumptions about pooling operations. By offering actionable insights, this study contributes to the optimization of LLM-based models for downstream tasks.
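As a concrete illustration of the three pooling mechanisms compared in this work, the sketch below shows one common way to implement mean, max, and weighted-sum pooling over masked token embeddings in PyTorch. The function names, the attention-mask handling, and the learned scoring layer in the weighted-sum variant are illustrative assumptions rather than the exact formulation used in our experiments.

```python
import torch
import torch.nn as nn

def mean_pooling(token_embeddings, attention_mask):
    # Average the embeddings of non-padded tokens.
    mask = attention_mask.unsqueeze(-1).float()            # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)          # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)               # avoid division by zero
    return summed / counts

def max_pooling(token_embeddings, attention_mask):
    # Element-wise maximum over the sequence; padded positions are set to -inf
    # so they never win the max.
    mask = attention_mask.unsqueeze(-1).bool()
    masked = token_embeddings.masked_fill(~mask, float("-inf"))
    return masked.max(dim=1).values

def weighted_sum_pooling(token_embeddings, attention_mask, scorer):
    # Weighted sum with per-token weights from a learned scoring layer
    # (hypothetical), normalized by a softmax over non-padded positions.
    scores = scorer(token_embeddings).squeeze(-1)          # (batch, seq_len)
    scores = scores.masked_fill(attention_mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=1).unsqueeze(-1)   # (batch, seq_len, 1)
    return (token_embeddings * weights).sum(dim=1)

# Example usage with hypothetical dimensions (batch=2, seq_len=5, hidden=768):
emb = torch.randn(2, 5, 768)
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1]])
scorer = nn.Linear(768, 1)
sentence_mean = mean_pooling(emb, mask)                    # (2, 768)
sentence_max = max_pooling(emb, mask)
sentence_wsum = weighted_sum_pooling(emb, mask, scorer)
```

In practice these functions would be applied to the last hidden states returned by a BERT- or GPT-style encoder; only the pooling step differs across the three variants, which is what the experiments in this paper isolate.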