Beyond Benchmarks: On The False Promise of AI Regulation - 专知 (Zhuanzhi) Papers

Benchmarks · Benchmarking · Artificial intelligence · Safety benchmarks · Transparency

December 14, 2025

Beyond Benchmarks: On The False Promise of AI Regulation

Gabriel Stanovsky, Renana Keydar, Gadi Perl, Eliya Habba

The performance of AI models on safety benchmarks does not indicate their real-world performance after deployment. This opaqueness of AI models impedes existing regulatory frameworks constituted on benchmark performance, leaving them incapable of mitigating ongoing real-world harm. The problem stems from a fundamental challenge in AI interpretability, which seems to be overlooked by regulators and decision makers. We propose a simple, realistic and readily usable regulatory framework which does not rely on benchmarks, and call for interdisciplinary collaboration to find new ways to address this crucial problem.
