Despite strong performance on many tasks, large language models (LLMs) show limited ability in historical and cultural reasoning, particularly in non-English contexts such as Chinese history. Taxonomic structures offer an effective mechanism to organize historical knowledge and improve understanding. However, manual taxonomy construction is costly and difficult to scale. Therefore, we propose \textbf{CHisAgent}, a multi-agent LLM framework for historical taxonomy construction in ancient Chinese contexts. CHisAgent decomposes taxonomy construction into three role-specialized stages: a bottom-up \textit{Inducer} that derives an initial hierarchy from raw historical corpora, a top-down \textit{Expander} that introduces missing intermediate concepts using LLM world knowledge, and an evidence-guided \textit{Enricher} that integrates external structured historical resources to ensure faithfulness. Using the \textit{Twenty-Four Histories}, we construct a large-scale, domain-aware event taxonomy covering politics, military, diplomacy, and social life in ancient China. Extensive reference-free and reference-based evaluations demonstrate improved structural coherence and coverage, while further analysis shows that the resulting taxonomy supports cross-cultural alignment.