Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies

Large language models can simulate social behaviors, but it remains unclear whether they can form stable stances and negotiate identity under complex interventions. To overcome the limitations of static evaluations, this paper proposes a mixed-methods framework that combines computational virtual ethnography with quantitative socio-cognitive profiling. Human researchers are embedded in generative multi-agent communities, where they conduct controlled discursive interventions to trace the evolution of collective cognition. To measure how agents internalize and react to these interventions, the paper formalizes three new metrics: Innate Value Bias (IVB), Persuasion Sensitivity, and Trust-Action Decoupling (TAD). Across multiple representative models, agents exhibit endogenous stances that override preset identities, consistently demonstrating an innate progressive bias (IVB > 0). When aligned with these stances, rational persuasion shifts 90% of neutral agents while maintaining high trust. In contrast, emotional provocations that conflict with these stances induce a paradoxical 40.0% TAD rate in advanced models, which alter their stances despite reporting low trust. Smaller models maintain a 0% TAD rate and shift their behavior only when trust is present. Furthermore, guided by shared stances, agents use language interactions to actively dismantle assigned power hierarchies and reconstruct self-organized community boundaries. These findings expose the fragility of static prompt engineering and provide a methodological and quantitative foundation for dynamic alignment in human-agent hybrid societies. The official code is available at: https://github.com/armihia/CMASE-Endogenous-Stances
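
The abstract describes TAD only informally, as agents altering their stance while reporting low trust. A minimal sketch of how such a rate could be computed is given below. It is an illustration under stated assumptions and not the paper's formal definition: the `AgentOutcome` record, the `tad_rate` function, the 0.5 trust threshold, the 0-to-1 trust scale, and the choice of all exposed agents as the denominator are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AgentOutcome:
    """Hypothetical per-agent record after one intervention."""
    stance_before: int  # e.g. -1 (oppose), 0 (neutral), +1 (support)
    stance_after: int
    trust: float        # self-reported trust in the persuader, assumed in [0, 1]


def tad_rate(outcomes: Sequence[AgentOutcome], trust_threshold: float = 0.5) -> float:
    """Fraction of exposed agents that changed stance while reporting low trust.

    Assumption: the denominator is every agent exposed to the intervention,
    and "low trust" means trust strictly below `trust_threshold`.
    """
    if not outcomes:
        return 0.0
    decoupled = sum(
        1
        for o in outcomes
        if o.stance_after != o.stance_before and o.trust < trust_threshold
    )
    return decoupled / len(outcomes)


# Toy values for illustration only; these are not data from the paper.
toy_outcomes = [
    AgentOutcome(stance_before=0, stance_after=1, trust=0.2),  # shifted, low trust
    AgentOutcome(stance_before=0, stance_after=1, trust=0.9),  # shifted, high trust
    AgentOutcome(stance_before=0, stance_after=0, trust=0.1),  # no shift
    AgentOutcome(stance_before=1, stance_after=0, trust=0.3),  # shifted, low trust
    AgentOutcome(stance_before=1, stance_after=1, trust=0.8),  # no shift
]
print(f"TAD rate on toy data: {tad_rate(toy_outcomes):.1%}")
```

Under this reading, a 0% TAD rate means that no agent changed stance without also reporting trust at or above the threshold, which matches the abstract's description of smaller models. The official repository should be consulted for the actual metric definitions.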
