As research on Large Language Models (LLMs) continues to accelerate, it has become difficult to keep up with new findings and models. To help researchers synthesize this work, many survey papers have been written, but even these have become numerous. In this paper, we develop a method to automatically assign survey papers to a taxonomy. We collect the metadata of 144 LLM survey papers and explore three paradigms for classifying papers within the taxonomy. Our results indicate that leveraging graph structure information from co-category graphs can significantly outperform the two language-model paradigms: fine-tuning pre-trained language models and zero-shot/few-shot classification with LLMs. We find that our model surpasses the average human recognition level, and that fine-tuning LLMs on weak labels generated by a smaller model, such as the GCN in this study, can be more effective than using ground-truth labels, revealing the potential of weak-to-strong generalization in the taxonomy classification task.
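To make the graph-based paradigm concrete, the following is a minimal sketch of a single GCN propagation step over a co-category graph, where papers are nodes and an edge connects two papers sharing a category. All sizes, features, and names here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    deg = a_hat.sum(axis=1)                      # node degrees
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))     # D^{-1/2}
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt     # symmetric normalization
    return np.maximum(a_norm @ features @ weights, 0.0)  # ReLU

# Toy co-category graph: 4 papers, an edge when two papers share a category.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))   # e.g. metadata/text embeddings per paper
weights = rng.normal(size=(8, 3))    # 3 hypothetical taxonomy classes

logits = gcn_layer(adj, features, weights)
print(logits.shape)  # (4, 3): one score vector per paper
```

Each propagation step mixes a paper's features with those of its co-category neighbors, which is how the graph structure contributes signal beyond the text alone; a full classifier would stack such layers and train the weights against taxonomy labels.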