Synthetic emotions and consciousness: exploring architectural boundaries

As artificial agents display increasingly sophisticated emotion-like behaviors, frameworks for assessing whether such systems risk instantiating consciousness remain limited. This contribution asks whether synthetic emotion-like control can be implemented while deliberately excluding architectural features that major theories associate with access-like consciousness. We propose architectural principles (A1-A8) for a hierarchical, dual-source implementation in which (i) immediate needs generate motivational signals and (ii) episodic memory provides affective guidance from similar past situations; the two sources converge to modulate action selection. To operationalize consciousness-related risk, we distill predictions from major theories into four engineering risk-reduction constraints: (R1) no content-general, workspace-like global broadcast, (R2) no metarepresentation, (R3) no autobiographical consolidation, and (R4) bounded learning. We address three questions: (Q1) Can emotion-like control satisfy R1-R4? We present a concrete architecture as an existence proof. (Q2) Can the architecture be extended without introducing access-enabling features? We identify stable modifications that preserve compliance. (Q3) Can we trace graded paths that plausibly increase access risk? We map gradual transitions that progressively violate the constraints. Our contribution operates at three levels: on the engineering side, we present a modular, biologically motivated control architecture; on the theoretical side, we propose a control model of emotions and a methodological template for converting consciousness-related questions into auditable architectural tests; on the safety side, we sketch preliminary audit indicators that may inform future governance frameworks. The architecture functions independently as an emotion-like controller, while the risk-reduction criteria may extend to other AI systems.
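The dual-source idea described above can be illustrated with a minimal sketch. It is not the authors' architecture: the names `Episode`, `need_drives`, `memory_guidance`, and `select_action`, the Euclidean similarity measure, the k-nearest averaging, and the additive scoring rule are all assumptions introduced for illustration. The sketch shows only how a need-based motivational signal and an episodic affective signal might converge on action selection. It makes no attempt to model or verify the constraints R1-R4.

```python
from dataclasses import dataclass
import math


@dataclass
class Episode:
    """One stored past situation (hypothetical structure, not from the paper)."""
    situation: tuple   # feature vector describing the situation
    action: str        # action taken in that situation
    valence: float     # affective outcome, assumed to lie in [-1, 1]


def need_drives(levels, setpoints):
    """Source (i): each need yields a drive equal to its deficit below the setpoint."""
    return {name: max(0.0, setpoints[name] - levels[name]) for name in setpoints}


def memory_guidance(situation, episodes, actions, k=2):
    """Source (ii): mean valence per action over the k most similar episodes.

    Similarity is plain Euclidean distance here, which is an assumption.
    """
    nearest = sorted(episodes, key=lambda e: math.dist(e.situation, situation))[:k]
    guidance = {a: 0.0 for a in actions}
    for e in nearest:
        guidance[e.action] += e.valence / len(nearest)
    return guidance


def select_action(drives, guidance, serves, memory_weight=1.0):
    """Convergence: score = drive of the need an action serves + weighted guidance."""
    def score(action):
        return drives[serves[action]] + memory_weight * guidance.get(action, 0.0)
    return max(serves, key=score)


# Illustrative values only; none of these numbers come from the paper.
serves = {"forage": "energy", "hide": "safety"}
levels = {"energy": 0.2, "safety": 0.9}
setpoints = {"energy": 0.8, "safety": 0.8}
episodes = [
    Episode((0.2, 0.9), "forage", -0.9),
    Episode((0.25, 0.9), "forage", -0.7),
    Episode((0.9, 0.1), "hide", 0.5),
]

drives = need_drives(levels, setpoints)
guidance = memory_guidance((0.2, 0.9), episodes, list(serves))
print(select_action(drives, {}, serves))        # needs alone
print(select_action(drives, guidance, serves))  # needs plus episodic guidance
```

In this toy setting the energy deficit alone favors foraging, while negative valence recalled from similar past episodes tips the choice toward hiding. The additive combination is one simple choice among many; a gating or multiplicative rule would be an equally plausible reading of "converge to modulate action selection".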
