Large Language Models for Patent Classification: Strengths, Trade-offs, and the Long Tail Effect

Patent classification into CPC codes underpins large-scale analyses of technological change but remains challenging because the task is hierarchical, multi-label, and highly imbalanced. Supervised encoder-based models became the de facto standard for large-scale patent classification before the generative-AI era, but recent advances in large language models (LLMs) raise the question of whether LLMs can provide complementary capabilities, particularly for rare or weakly represented technological categories. In this work, we perform a systematic comparison of encoder-based classifiers (BERT, SciBERT, and PatentSBERTa) and open-weight LLMs on a highly imbalanced benchmark dataset (USPTO 70k). We evaluate LLMs under zero-shot, few-shot, and retrieval-augmented prompting, and further assess parameter-efficient fine-tuning of the best-performing model. Our results show that encoder-based models achieve higher aggregate performance, driven by strong results on frequent CPC subclasses, but struggle on rare ones. In contrast, LLMs achieve relatively higher performance on infrequent subclasses, which are often associated with early-stage, cross-domain, or weakly institutionalised technologies, particularly at higher hierarchical levels. These findings indicate that encoder-based and LLM-based approaches play complementary roles in patent classification. We additionally quantify inference time and energy consumption, showing that encoder-based models are up to three orders of magnitude more efficient than LLMs. Overall, our results inform responsible patentometrics and technology mapping, and motivate hybrid classification approaches that combine encoder efficiency with the long-tail coverage of LLMs under computational and environmental constraints.

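The contrast the abstract draws between frequent and rare CPC subclasses can be made concrete by scoring the two groups separately. The following is a minimal sketch, not the authors' evaluation code: it assumes predictions and ground truth are available as sets of CPC subclass codes per patent, and the function names, the `rare_threshold` cut-off, and the toy data are all hypothetical.

```python
from collections import Counter


def f1_per_label(y_true, y_pred):
    """Return {label: F1} for multi-label data given as lists of label sets."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(y_true, y_pred):
        for label in pred & true:
            tp[label] += 1
        for label in pred - true:
            fp[label] += 1
        for label in true - pred:
            fn[label] += 1
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def head_tail_macro_f1(y_train, y_true, y_pred, rare_threshold=2):
    """Macro-F1 over frequent vs. rare labels, split by training-set support.

    rare_threshold is an illustrative cut-off, not a value from the paper.
    """
    support = Counter(label for labels in y_train for label in labels)
    scores = f1_per_label(y_true, y_pred)
    # Only labels that occur in the test ground truth are scored.
    test_labels = {label for labels in y_true for label in labels}
    head = [scores[l] for l in test_labels if support[l] >= rare_threshold]
    tail = [scores[l] for l in test_labels if support[l] < rare_threshold]

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(head), mean(tail)


# Toy data (hypothetical): label sets per patent.
y_train = [{"G06F"}, {"G06F", "H04L"}, {"G06F"}, {"H04L"}, {"B82Y"}]
y_true = [{"G06F"}, {"G06F", "H04L"}, {"B82Y"}]
y_pred = [{"G06F"}, {"G06F"}, {"B82Y", "G06F"}]

head_f1, tail_f1 = head_tail_macro_f1(y_train, y_true, y_pred)
```

Reporting the two numbers side by side, rather than a single aggregate score, is what lets a comparison of this kind show a model that wins overall while losing on the long tail.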