This paper presents CAALM-TC (Combining Autoregressive and Autoencoder Language Models for Text Classification), a novel method that enhances text classification by integrating autoregressive and autoencoder language models. Autoregressive large language models such as OpenAI's GPT, Meta's Llama, or Microsoft's Phi offer promising prospects for content analysis practitioners, but they generally underperform supervised BERT-based models on text classification. CAALM uses an autoregressive model to generate contextual information about each input text, which is then combined with the original text and fed into an autoencoder model for classification. This hybrid approach capitalizes on the extensive contextual knowledge of autoregressive models and the efficient classification capabilities of autoencoders. Experimental results on four benchmark datasets show that CAALM consistently outperforms existing methods, particularly on tasks with smaller training sets and more abstract classification objectives. The findings indicate that CAALM offers a scalable and effective solution for automated content analysis in social science research that minimizes sample size requirements.
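The pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `generate_context` and `classify` are hypothetical stand-ins for, respectively, a prompted autoregressive LLM (e.g. GPT, Llama, or Phi) and a fine-tuned BERT-style classifier, and the `[SEP]`-style concatenation is an assumed joining scheme.

```python
def generate_context(text: str) -> str:
    """Stand-in for an autoregressive LLM prompted to describe the input.

    A real implementation would prompt the model with something like
    'Summarize the topic of the following text: ...' and return its output.
    """
    # Dummy heuristic in place of an actual LLM call.
    return f"Context: this text discusses {text.split()[0].lower()}."


def build_classifier_input(text: str, context: str, sep: str = " [SEP] ") -> str:
    """Concatenate the original text with the generated context so a
    BERT-style autoencoder can classify the enriched input."""
    return text + sep + context


def classify(enriched_text: str) -> int:
    """Stand-in for a supervised BERT-based classifier fine-tuned on the
    enriched (text + context) inputs; returns a dummy label here."""
    return int("Context:" in enriched_text)


text = "Climate policy debates intensified this year."
enriched = build_classifier_input(text, generate_context(text))
label = classify(enriched)
```

The key design point is that the autoencoder never sees the raw text alone: it is always trained and evaluated on the text enriched with LLM-generated context, which is what lets a small labeled sample benefit from the autoregressive model's background knowledge.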