Modern AI systems exhibit structural failures that capability scaling alone does not reliably fix: they optimize under-specified objectives with no architectural mechanism to question whether the objective should be optimized at all. Engagement maximization can amplify harmful pathways; tool-using agents can commit irreversible actions; preference-trained language models can become sycophantic. We argue that these failures are a wisdom problem, not an intelligence problem. We use "wisdom" in a deliberately architectural sense, not as a claim about virtue, consciousness, or moral omniscience. Intelligence accepts a goal and optimizes within it; wisdom interrogates whether the goal should be optimized at all. The two are separable architectural properties. We propose architectural wisdom as a corrigible objective-governance layer above the optimization substrate. The layer makes three structural commitments explicit and nondegenerate before any action: temporal horizon, relational boundary, and irreversibility. It is realized by four components (Structural Utility Transform, Moral Admissibility Interface, Arbitration and Escalation Controller, Value Revision Channel) that compute a six-coordinate wisdom tuple over horizon, relational coverage, irreversibility, admissibility, value revision, and auditability. We motivate the architecture with eight cases drawn from contemporary AI failures, secular wisdom traditions, and hard ethical situations. We defend the distinction against the intelligence-completeness thesis with four arguments: the difference between goal-questioning and goal-taking, Bostrom's orthogonality thesis, structural separation in our exemplar cases, and failure modes that persist despite capability scaling. The framework is the conceptual contract for a larger architecture whose formal specifications and empirical validation are developed in subsequent work.