We posit that transforming similarity relations form the structural basis of comprehensible dynamic systems. This paper introduces Similarity Field Theory, a mathematical framework that formalizes the principles governing similarity values among entities and their evolution. We define: (1) a similarity field $S: U \times U \to [0,1]$ over a universe of entities $U$, satisfying reflexivity $S(E,E)=1$ and treated as a directed relational field (asymmetry and non-transitivity are allowed); (2) the evolution of a system through a sequence $Z_p=(X_p,S^{(p)})$ indexed by $p=0,1,2,\ldots$; (3) concepts $K$ as entities that induce fibers $F_\alpha(K)=\{E\in U \mid S(E,K)\ge \alpha\}$, i.e., superlevel sets of the unary map $S_K(E):=S(E,K)$; and (4) a generative operator $G$ that produces new entities. Within this framework, we formalize a generative definition of intelligence: an operator $G$ is intelligent with respect to a concept $K$ if, given a system containing entities belonging to the fiber of $K$, it generates new entities that also belong to that fiber. Similarity Field Theory thus offers a foundational language for characterizing, comparing, and constructing intelligent systems. At a high level, this framework reframes intelligence and interpretability as geometric problems on similarity fields (preserving and composing level-set fibers) rather than statistical ones. We prove two theorems: (i) asymmetry blocks mutual inclusion; and (ii) stability implies either an anchor coordinate or asymptotic confinement to the target level (up to arbitrarily small tolerance). Together, these results constrain similarity-field evolution and motivate an interpretive lens applicable to large language models. AI systems may be aligned less to safety as such than to human-observable and human-interpretable conceptions of safety, which may not fully determine the underlying safety concept.
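To make definitions (1), (3), and (4) concrete, the following is a minimal Python sketch of a fiber $F_\alpha(K)$ and of one possible reading of the generative definition of intelligence. Everything beyond the symbols $S$, $U$, $K$, $\alpha$, and $G$ is an assumption made for illustration: the toy similarity `toy_similarity`, the helper names `fiber` and `preserves_fiber`, the numeric entities, and the choice to require that *all* newly generated entities (rather than some) land in the fiber.

```python
from itertools import combinations
from typing import Callable, Hashable, Iterable, Set

Entity = Hashable
Similarity = Callable[[Entity, Entity], float]


def toy_similarity(a: float, b: float) -> float:
    """Hypothetical directed similarity on numbers, with values in (0, 1].

    Reflexive (S(E, E) = 1) and deliberately asymmetric: overshooting the
    reference b is penalised twice as hard as undershooting it.
    """
    if a == b:
        return 1.0
    gap = abs(a - b)
    return 1.0 / (1.0 + gap) if a < b else 1.0 / (1.0 + 2.0 * gap)


def fiber(S: Similarity, entities: Iterable[Entity], K: Entity, alpha: float) -> Set[Entity]:
    """Superlevel set F_alpha(K) = {E : S(E, K) >= alpha}, restricted to `entities`."""
    return {E for E in entities if S(E, K) >= alpha}


def preserves_fiber(G: Callable[[Set[Entity]], Iterable[Entity]], S: Similarity,
                    X: Set[Entity], K: Entity, alpha: float) -> bool:
    """Assumed reading of 'G is intelligent with respect to K' on a single step.

    X must already contain members of the fiber of K, and G must produce at
    least one new entity, all of which lie in the same fiber.
    """
    if not fiber(S, X, K, alpha):
        return False  # the premise of the definition is not met
    new_entities = set(G(X)) - set(X)
    return bool(new_entities) and all(S(E, K) >= alpha for E in new_entities)


def midpoints(X: Set[float]) -> Set[float]:
    """Hypothetical generator: midpoints of all pairs of existing entities."""
    return {(a + b) / 2 for a, b in combinations(sorted(X), 2)}


def shift_by_five(X: Set[float]) -> Set[float]:
    """Hypothetical generator that drifts away from the existing entities."""
    return {a + 5 for a in X}


X0 = {8, 9, 10}      # current entities X_p
K, ALPHA = 10, 0.3   # concept entity and threshold

in_fiber = fiber(toy_similarity, X0, K, ALPHA)
good = preserves_fiber(midpoints, toy_similarity, X0, K, ALPHA)
bad = preserves_fiber(shift_by_five, toy_similarity, X0, K, ALPHA)
```

In this toy setting `midpoints` keeps its new entities inside the fiber of `K` while `shift_by_five` does not, and `toy_similarity(8, 10)` differs from `toy_similarity(10, 8)`, which illustrates the directed (asymmetric) character of the field. The sketch checks a single generation step only; it does not model the evolving sequence $Z_p=(X_p,S^{(p)})$ or the two theorems.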