Language change both reflects and shapes social processes, and the semantic evolution of foundational concepts provides a measurable trace of historical and social transformation. Despite recent advances in diachronic semantics and discourse analysis, existing computational approaches often (i) concentrate on a single concept or a single corpus, which makes findings difficult to compare across heterogeneous sources, and (ii) remain confined to surface lexical evidence, which offers too little computational and interpretive granularity when concepts are expressed implicitly. We propose HistLens, a unified framework based on sparse autoencoders (SAEs) for multi-concept, multi-corpus conceptual-history analysis. The framework decomposes concept representations into interpretable features and tracks their activation dynamics over time and across sources, yielding conceptual trajectories that are comparable within a shared coordinate system. Experiments on long-span press corpora show that HistLens supports cross-concept, cross-corpus computation of patterns of idea evolution and enables computation over concepts that are expressed implicitly. By bridging conceptual modeling with interpretive needs, HistLens broadens the analytical perspectives and methodological repertoire available to the social sciences and the humanities for diachronic text analysis.
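The core idea of decomposing concept representations into features and tracking their activations over time and across sources can be illustrated with a minimal sketch. It is not the HistLens implementation: the abstract does not specify the encoder architecture, the aggregation rule, or the time granularity, so the ReLU encoder form, the per-(corpus, period) mean, the function names `sae_encode` and `feature_trajectories`, and the toy weights and records are all assumptions made for illustration.

```python
from collections import defaultdict

def sae_encode(x, W_enc, b_enc):
    """Assumed SAE encoder: feature activations = ReLU(W_enc @ x + b_enc)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_enc, b_enc)]

def feature_trajectories(records, W_enc, b_enc):
    """Mean activation of each feature per (corpus, period).

    records: iterable of (corpus, period, embedding) tuples.
    Because every corpus is encoded with the same weights, the
    resulting vectors share one feature coordinate system.
    """
    sums = {}
    counts = defaultdict(int)
    for corpus, period, x in records:
        acts = sae_encode(x, W_enc, b_enc)
        key = (corpus, period)
        if key not in sums:
            sums[key] = [0.0] * len(acts)
        sums[key] = [s + a for s, a in zip(sums[key], acts)]
        counts[key] += 1
    return {k: [s / counts[k] for s in v] for k, v in sums.items()}

# Toy, hand-picked encoder weights (hypothetical, not trained): 3 features over 2-d inputs.
W_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_enc = [0.0, 0.0, 0.0]

# Toy concept embeddings tagged with a hypothetical corpus name and period.
records = [
    ("paper_A", 1900, [1.0, 0.0]),
    ("paper_A", 1900, [3.0, 0.0]),
    ("paper_A", 1950, [0.0, 2.0]),
    ("paper_B", 1900, [-1.0, -1.0]),
]

traj = feature_trajectories(records, W_enc, b_enc)
for key in sorted(traj):
    print(key, traj[key])
```

In this toy setup, the first feature dominates `paper_A` in 1900 and the second in 1950, while `paper_B` activates only the third. Comparisons of this kind across periods and sources are what a shared feature space makes possible.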