Can artificial intelligence discover, from raw experience and without human supervision, concepts that humans have discovered? One challenge is that human concepts themselves are fluid: conceptual boundaries can shift, split, and merge as inquiry progresses (e.g., Pluto is no longer considered a planet). To make progress, we need a definition of "concept" that is not merely a dictionary label, but a structure that can be revised, compared, and aligned across agents. We propose an algorithmic-information viewpoint that treats a concept as an information object defined only through its structural relation to an agent's total experience. The core constraint is determination: a set of parts forms a reversible consistency relation if any missing part is recoverable from the others (up to the standard logarithmic slack in Kolmogorov-style identities). This reversibility prevents "concepts" from floating free of experience and turns concept existence into a checkable structural claim. To judge whether a decomposition is natural, we define excess information, measuring the redundancy overhead introduced by splitting experience into multiple separately described parts. On top of these definitions, we formulate dialectics as an optimization dynamics: as new patches of information appear (or become contested), competing concepts bid to explain them via shorter conditional descriptions, driving systematic expansion, contraction, splitting, and merging. Finally, we formalize low-cost concept transmission and multi-agent alignment using small grounds/seeds that allow another agent to reconstruct the same concept under a shared protocol, making communication a concrete compute-bits trade-off.
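To make the central quantities concrete, one possible formalization is sketched below. It is a reading of the abstract under stated assumptions, not the definitions used in the work itself. The symbols are introduced here for illustration only: $x$ is the agent's total experience, $x_1,\dots,x_n$ are candidate parts, $K$ is prefix Kolmogorov complexity, $\Delta$ is the excess information, and $\mathcal{C}$ is the current set of competing concepts.

```latex
% Assumed notation (not from the source): x = total experience,
% x_1,...,x_n = candidate parts, K = prefix Kolmogorov complexity.
% Requires amsmath for \text.

% Determination: each part is recoverable from the remaining parts,
% up to logarithmic slack.
\[
  K\bigl(x_i \mid x_1,\dots,x_{i-1},x_{i+1},\dots,x_n\bigr) = O\bigl(\log K(x)\bigr)
  \quad \text{for every } i .
\]

% Excess information: overhead of describing the parts separately
% rather than jointly.
\[
  \Delta(x_1,\dots,x_n) \;=\; \sum_{i=1}^{n} K(x_i) \;-\; K(x_1,\dots,x_n).
\]

% Bidding: a new patch p is claimed by the concept that gives it
% the shortest conditional description.
\[
  c^{*}(p) \;=\; \arg\min_{c \in \mathcal{C}} K(p \mid c).
\]
```

Under this reading, a decomposition would count as more natural when $\Delta$ is small, and the expansion, contraction, splitting, and merging of concepts would follow from repeatedly reassigning contested patches to the lowest bidder.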