Semantic feature norms, lists of features that concepts do and do not possess, have played a central role in characterizing human conceptual knowledge, but compiling them requires extensive human labor. Large language models (LLMs) offer a novel avenue for generating such feature lists automatically, but they are prone to significant error. Here, we present a new method that combines a model of human lexical semantics, learned from limited data, with LLM-generated data to efficiently generate high-quality feature norms.