We present the TRIAGE Benchmark, a novel machine ethics (ME) benchmark that tests LLMs' ability to make ethical decisions during mass casualty incidents. It uses real-world ethical dilemmas with clear solutions designed by medical professionals, offering a more realistic alternative to annotation-based benchmarks. TRIAGE incorporates various prompting styles to evaluate model performance across different contexts. Most models consistently outperformed random guessing, suggesting that LLMs may be able to support decision-making in triage scenarios. Neutral or factual scenario formulations yielded the best performance, unlike other ME benchmarks where ethical reminders improved outcomes. Adversarial prompts reduced performance, but not to the level of random guessing. Open-source models made more morally serious errors, and greater general capability predicted better performance overall.