As large language models (LLMs) are increasingly embedded in everyday decision-making, their safety responsibilities extend beyond reacting to explicit harmful intent toward anticipating unintended but consequential risks. In this work, we introduce a proactive risk awareness evaluation framework that measures whether LLMs can anticipate potential harms and issue warnings before damage occurs. We construct the Butterfly dataset to instantiate this framework in the environmental and ecological domain. It contains 1,094 queries simulating ordinary solution-seeking activities whose responses may induce latent ecological impact. Through experiments on five widely used LLMs, we analyze the effects of response length, language, and modality. The results reveal consistent and significant declines in proactive awareness under length-restricted responses, broadly similar behavior across languages, and persistent blind spots in species protection, particularly in multimodal settings. These findings highlight a critical gap between current safety alignment and the requirements of real-world ecological responsibility, underscoring the need for proactive safeguards in LLM deployment.