Responsible AI demands systems whose behavioral tendencies can be effectively measured, audited, and adjusted, so that they do not inadvertently nudge users toward risky decisions or embed hidden biases in risk aversion. As language models (LMs) are increasingly incorporated into AI-driven decision support systems, understanding their risk behaviors is crucial for responsible deployment. This study investigates the manipulability of risk aversion (MoRA) in LMs, examining their ability to replicate human risk preferences across diverse economic scenarios, with a focus on gender-specific attitudes, decision-making under uncertainty, role-based decision-making, and the degree to which risk aversion can be manipulated. The results indicate that while LMs such as DeepSeek Reasoner and Gemini-2.0-flash-lite exhibit some alignment with human behavior, notable discrepancies remain, highlighting the need to refine bio-centric measures of manipulability. These findings suggest directions for refining AI design so that human and AI risk preferences are better aligned and ethical decision-making is strengthened. The study calls for further advances in model design to ensure that AI systems replicate human risk preferences more accurately, thereby improving their effectiveness and applicability as AI assistants in risk management contexts.