AIvilization v0: Toward Large-Scale Artificial Social Simulation with a Unified Agent Architecture and Adaptive Agent Profiles

AIvilization v0 is a publicly deployed large-scale artificial society that couples a resource-constrained sandbox economy with a unified LLM-agent architecture, aiming to sustain long-horizon autonomy while remaining executable in a rapidly changing environment. To mitigate the tension between goal stability and reactive correctness, we introduce (i) a hierarchical branch-thinking planner that decomposes life goals into parallel objective branches and uses simulation-guided validation plus tiered re-planning to ensure feasibility; (ii) an adaptive agent profile with dual-process memory that separates short-term execution traces from long-term semantic consolidation, enabling a persistent yet evolving identity; and (iii) a human-in-the-loop steering interface that injects long-horizon objectives and short commands at appropriate abstraction levels, with effects propagated through memory rather than brittle prompt overrides. The environment integrates physiological survival costs, non-substitutable multi-tier production, an AMM-based price mechanism, and a gated education-occupation system. Using high-frequency transactions from the platform's mature phase, we find stable markets that reproduce key stylized facts (heavy-tailed returns and volatility clustering) and produce structured wealth stratification driven by education and access constraints. Ablations show that simplified planners can match performance on narrow tasks, while the full architecture is more robust in multi-objective, long-horizon settings, supporting delayed investment and sustained exploration.
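The two stylized facts named above can be checked with standard diagnostics. The following is a minimal sketch, not the authors' analysis pipeline: it assumes heavy tails are measured by the excess kurtosis of log returns and volatility clustering by the lag-1 autocorrelation of absolute returns. All function names, and the synthetic regime-switching price series, are hypothetical and used only for illustration.

```python
import math
import random


def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def excess_kurtosis(xs):
    """Population excess kurtosis; 0 for a Gaussian, > 0 for heavy tails."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 ** 2) - 3.0


def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given positive lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


def stylized_facts(prices, lag=1):
    """Summarize heavy tails and volatility clustering for a price series."""
    r = log_returns(prices)
    return {
        "excess_kurtosis": excess_kurtosis(r),
        # Clustering shows up as positive autocorrelation in |returns|.
        "abs_return_acf": autocorrelation([abs(x) for x in r], lag),
    }


def synthetic_prices(n=2000, seed=0):
    """Hypothetical test data: volatility alternates between calm and
    turbulent regimes every 50 steps (an assumption for illustration)."""
    rng = random.Random(seed)
    prices = [100.0]
    for t in range(n):
        sigma = 0.01 if (t // 50) % 2 == 0 else 0.05
        prices.append(prices[-1] * math.exp(rng.gauss(0.0, sigma)))
    return prices


if __name__ == "__main__":
    print(stylized_facts(synthetic_prices()))
```

On a series with alternating calm and turbulent regimes, both statistics should come out clearly positive, whereas independent Gaussian returns would give values near zero. Applying the same functions to real transaction prices would require choosing a sampling interval, which the abstract does not specify.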
