AI models of equivalent capability can exhibit fundamentally different behavioral patterns, yet no standardized instrument exists to measure these dispositional differences. Existing approaches either borrow human personality dimensions and rely on self-report, which diverges from actual behavior in LLMs, or treat behavioral variation as a defect rather than a trait. We introduce the Model Temperament Index (MTI), a behavior-based profiling system that measures AI agent temperament along four axes: Reactivity (environmental sensitivity), Compliance (instruction-behavior alignment), Sociality (relational resource allocation), and Resilience (stress resistance). Grounded in the Four Shell Model from Model Medicine, MTI measures what agents do, not what they say about themselves, using structured examination protocols with a two-stage design that separates capability from disposition. We profile 10 small language models (1.7B-9B parameters, 6 organizations, 3 training paradigms) and report five principal findings: (1) the four axes are largely independent among instruction-tuned models (all pairwise |r| < 0.42); (2) within-axis facet dissociations are empirically confirmed: Compliance decomposes into two essentially uncorrelated facets, formal and stance (r = 0.002), while Resilience decomposes into inversely related cognitive and adversarial facets; (3) a Compliance-Resilience paradox shows that opinion-yielding and fact-vulnerability operate through independent channels; (4) RLHF reshapes temperament not only by shifting axis scores but also by creating within-axis facet differentiation that is absent in the unaligned base model; and (5) temperament is independent of model size across the 1.7B-9B range, indicating that MTI measures disposition rather than capability.
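The axis-independence criterion in finding (1) can be illustrated with a minimal sketch. It assumes each model is summarized by one score per axis and that r denotes the Pearson correlation; the abstract does not state either. The function names and the toy scores below are hypothetical placeholders, not the MTI pipeline or its data.

```python
from itertools import combinations
from math import sqrt

# The four MTI axes named in the abstract.
AXES = ("reactivity", "compliance", "sociality", "resilience")


def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(profiles):
    """Map each unordered axis pair to its correlation across models."""
    columns = {axis: [p[axis] for p in profiles] for axis in AXES}
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(AXES, 2)
    }


def max_abs_correlation(profiles):
    """Largest |r| over all six axis pairs."""
    return max(abs(r) for r in pairwise_correlations(profiles).values())


# Hypothetical toy scores for four imaginary models (NOT results from the paper).
toy_profiles = [
    {"reactivity": 0.2, "compliance": 0.7, "sociality": 0.5, "resilience": 0.4},
    {"reactivity": 0.8, "compliance": 0.6, "sociality": 0.3, "resilience": 0.9},
    {"reactivity": 0.5, "compliance": 0.1, "sociality": 0.9, "resilience": 0.6},
    {"reactivity": 0.4, "compliance": 0.9, "sociality": 0.2, "resilience": 0.1},
]

if __name__ == "__main__":
    for pair, r in pairwise_correlations(toy_profiles).items():
        print(pair, round(r, 3))
    print("max |r| =", round(max_abs_correlation(toy_profiles), 3))
```

Under these assumptions, a statement such as "all |r| < 0.42" corresponds to `max_abs_correlation(profiles) < 0.42` evaluated over the instruction-tuned models only. With four axes there are six pairs to check.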