Do different LLM architectures encode high-level concepts in structurally compatible ways? We systematically characterize a geometric-functional universality dissociation: across multiple concept domains and architectural families, moderate geometric convergence coexists with near-perfect functional transfer. Using contrastive-difference CKA (CKA_Delta), a training-free diagnostic that computes kernel alignment on per-sample contrastive differences, we isolate concept-specific convergence from generic similarity and achieve significant discrimination where standard CKA cannot. The dissociation replicates across all six concept domains we test: five show geometric discrimination at p <= 0.017, and safety shows a converging functional trend (p = 0.08). The six include two non-instruction concepts (code-vs-NL, reasoning-vs-recall) validated without system prompts. A single 70B-70B pair offers an observational note that universality may strengthen with scale; this requires replication with additional >=70B models. We position CKA_Delta as a practical regime classifier and architectural outlier detector (Gemma: d = 1.08, AUC = 0.79) rather than an absolute transfer-accuracy predictor, providing a training-free diagnostic for cross-architecture concept monitoring.
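The CKA_Delta computation described in the abstract can be illustrated with a minimal sketch. It assumes a linear kernel, column-centered features, and one activation vector per prompt from each model; the abstract specifies none of these choices. The function names `linear_cka` and `cka_delta` and the synthetic data are hypothetical, and this is not presented as the authors' implementation.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two feature matrices with the same number of rows.

    x has shape (n_samples, d_x) and y has shape (n_samples, d_y).
    """
    # Center each feature column so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def cka_delta(pos_a, neg_a, pos_b, neg_b) -> float:
    """CKA on per-sample contrastive differences (assumed form).

    pos_* / neg_* hold activations for the concept-present and
    concept-absent version of each sample, for model A and model B.
    """
    delta_a = pos_a - neg_a  # concept direction per sample, model A
    delta_b = pos_b - neg_b  # concept direction per sample, model B
    return linear_cka(delta_a, delta_b)


# Synthetic demonstration (random data, not results from the paper).
rng = np.random.default_rng(0)
n, d = 200, 16
pos_a = rng.normal(size=(n, d))
neg_a = rng.normal(size=(n, d))

# Model B as a rotated, rescaled, shifted copy of model A: the shift
# cancels in the differences, and linear CKA is invariant to rotation
# and isotropic scaling.
q, _ = np.linalg.qr(rng.normal(size=(d, d)))
offset = rng.normal(size=(1, d))
pos_b = 3.0 * pos_a @ q + offset
neg_b = 3.0 * neg_a @ q + offset
score_rotated = cka_delta(pos_a, neg_a, pos_b, neg_b)

# Model C with unrelated activations and a different hidden width.
pos_c = rng.normal(size=(n, 24))
neg_c = rng.normal(size=(n, 24))
score_random = cka_delta(pos_a, neg_a, pos_c, neg_c)

print(f"rotated copy: {score_rotated:.3f}, unrelated model: {score_random:.3f}")
```

Because a shared offset drops out when the paired activations are subtracted, a score of this form responds to how the two models separate the concept rather than to similarity common to all inputs, which is the property the abstract attributes to CKA_Delta relative to standard CKA.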