Concepts play a pivotal role in various human cognitive functions, including learning, reasoning, and communication. However, there is very little work on endowing machines with the ability to form and reason with concepts. In particular, state-of-the-art large language models (LLMs) work at the level of tokens, not concepts. In this work, we analyze how well contemporary LLMs capture human concepts and their structure. We then discuss ways to develop concept-aware LLMs at different stages of the pipeline. We sketch a method for pretraining LLMs using concepts, and also explore a simpler approach that uses the output of existing LLMs. Despite its simplicity, our proof-of-concept is shown to better match human intuition and to improve the robustness of predictions. These preliminary results underscore the promise of concept-aware LLMs.