Personality Requires Struggle: Three Regimes of the Baldwin Effect in Neuroevolved Chess Agents

Can lifetime learning expand behavioral diversity over evolutionary time, rather than collapsing it? Prior theory predicts that plasticity reduces variance by buffering organisms against environmental noise. We test this in a competitive domain: chess agents with eight NEAT-evolved neural modules, Hebbian within-game plasticity, and a desirability-domain signal chain with imagination. Across 10~seeds per Hebbian condition, a variance crossover emerges: Hebbian ON starts with lower cross-seed variance than OFF, then surpasses it at generation~34. The crossover trend is monotonic ($\rho = 0.91$, $p < 10^{-6}$): plasticity's effect on behavioral variance reverses over evolutionary time, initially compressing diversity (consistent with prior predictions), then expanding it as evolved Perception differences are amplified through imagination, a feedback loop that mutation alone cannot sustain. The result is structured behavioral divergence: evolved agents select different moves on the same positions (62\% disagreement) and develop distinct opening repertoires, piece preferences, and game lengths. These are not different sampling policies; they are reproducible behavioral signatures (ICC $> 0.8$) with interpretable signal chain configurations. Three regimes appear depending on opponent type: exploration (Hebbian ON, heterogeneous opponent), lottery (Hebbian OFF, elitism lock-in), and transparent (same-model opponent, brain self-erasure). The transparent regime generates a falsifiable prediction: self-play systems may systematically suppress behavioral diversity by eliminating the heterogeneity that personality requires.

\textbf{Keywords:} Baldwin Effect, neuroevolution, NEAT, Hebbian learning, chess, cognitive architecture, personality emergence, imagination
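As a rough illustration of what a cross-seed variance crossover means operationally, the following minimal sketch computes the variance of a behavioral metric across seeds at each generation and reports the first generation at which one condition overtakes the other. The function names, the choice of population variance, and the toy numbers are assumptions made for illustration; they are not the paper's code, metric, or data.

```python
from statistics import pvariance


def cross_seed_variance(metric_by_seed):
    """Variance across seeds at each generation.

    metric_by_seed: list of per-seed lists, one metric value per generation.
    Assumption: population variance; the paper's estimator is not stated here.
    """
    n_gen = min(len(run) for run in metric_by_seed)
    return [pvariance([run[g] for run in metric_by_seed]) for g in range(n_gen)]


def first_crossover(var_on, var_off):
    """First generation g where var_on exceeds var_off after being at or
    below it at generation g - 1. Returns None if that never happens."""
    for g in range(1, min(len(var_on), len(var_off))):
        if var_on[g - 1] <= var_off[g - 1] and var_on[g] > var_off[g]:
            return g
    return None


# Synthetic toy data (3 seeds x 4 generations), NOT results from the paper.
# "ON" seeds start close together and drift apart; "OFF" seeds keep a fixed spread.
hebbian_on = [[0, 0, 0, 0], [1, 2, 6, 10], [2, 4, 12, 20]]
hebbian_off = [[0, 0, 0, 0], [3, 3, 3, 3], [6, 6, 6, 6]]

var_on = cross_seed_variance(hebbian_on)
var_off = cross_seed_variance(hebbian_off)
crossover = first_crossover(var_on, var_off)
print("crossover generation:", crossover)
```

In this toy setup the ON variance begins below the OFF variance and overtakes it partway through the run, which is the qualitative pattern described above; a real analysis would substitute the logged per-seed behavioral metric for each Hebbian condition.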
