AI impact assessments often stress near-term risks because human judgment degrades over longer horizons, exemplifying the Collingridge dilemma: foresight is most needed when knowledge is scarcest. To address long-term systemic risks, we introduce a scalable approach that simulates in-silico agents using the Futures Wheel, a strategic foresight method. We applied it to four AI use cases spanning Technology Readiness Levels (TRLs): Chatbot Companion (TRL 9, mature), AI Toy (TRL 7, medium), Griefbot (TRL 5, low), and Death App (TRL 2, conceptual). Across 30 agent runs per use case, agents produced 86-110 consequences, condensed into 27-47 unique risks. To benchmark the agent outputs against human perspectives, we collected evaluations from 290 domain experts and 7 leaders, and conducted Futures Wheel sessions with 42 experts and 42 laypeople. Agents consistently generated systemic consequences across runs. Compared with these outputs, experts identified fewer risks, which were typically less systemic but judged more likely, whereas laypeople surfaced more emotionally salient concerns that were generally less systemic. We propose a hybrid foresight workflow in which agents broaden systemic coverage and humans provide contextual grounding. Our dataset is available at: https://social-dynamics.net/ai-risks/foresight.