A Framework for Cluster and Classifier Evaluation in the Absence of Reference Labels - 专知论文

会员服务 ·

0

簇 · 真实值 · 标注 · 可辨认的 · Performer ·

2021 年 9 月 23 日

A Framework for Cluster and Classifier Evaluation in the Absence of Reference Labels

翻译：缺乏参考标签时的分组和分类评价框架

Robert J. Joyce,Edward Raff,Charles Nicholas

from arxiv, to appear in Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security

In some problem spaces, the high cost of obtaining ground truth labels necessitates use of lower quality reference datasets. It is difficult to benchmark model performance using these datasets, as evaluation results may be biased. We propose a supplement to using reference labels, which we call an approximate ground truth refinement (AGTR). Using an AGTR, we prove that bounds on specific metrics used to evaluate clustering algorithms and multi-class classifiers can be computed without reference labels. We also introduce a procedure that uses an AGTR to identify inaccurate evaluation results produced from datasets of dubious quality. Creating an AGTR requires domain knowledge, and malware family classification is a task with robust domain knowledge approaches that support the construction of an AGTR. We demonstrate our AGTR evaluation framework by applying it to a popular malware labeling tool to diagnose over-fitting in prior testing and evaluate changes whose impact could not be meaningfully quantified under previous data.

翻译：在某些问题领域,获取地面真相标签的成本高昂,需要使用质量较低的参考数据集,很难用这些数据集来衡量模型性能,因为评价结果可能有偏差。我们建议对使用参考标签进行补充,我们称之为近似地面真相改进(AGTR)。我们使用AGTR,证明可以不使用参考标签来计算用于评价集群算法和多级分类器的具体指标的界限。我们还采用一个程序,利用AGTR来查明从质量可疑的数据集中得出的不准确的评价结果。建立AGTR需要域知识,而恶意软件家庭分类是一项具有可靠领域知识的任务,支持AGTR的构建。我们通过将AGTR评估框架应用到普通的恶意软件标签工具来诊断在先前测试和评价那些影响无法在先前数据下有意义地量化的变化时是否过度适用。

0

相关内容

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

128+阅读 · 2020年11月20日

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

专知会员服务

10+阅读 · 2020年5月14日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

116+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书】数据科学编程傻瓜式教程，数据科学编程一体机 | Data Science Programming All-In-One For Dummies

【新书】数据科学编程傻瓜式教程，数据科学编程一体机 | Data Science Programming All-In-One For Dummies

专知会员服务

41+阅读 · 2020年1月22日

【新书】Python数据科学食谱（Python Data Science Cookbook）

【新书】Python数据科学食谱（Python Data Science Cookbook）

专知会员服务

118+阅读 · 2020年1月1日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

已删除

将门创投

5+阅读 · 2018年6月7日

Measuring Uncertainty in Translation Quality Evaluation (TQE)

Arxiv

0+阅读 · 2021年11月15日

New Performance Measures for Object Tracking under Complex Environments

Arxiv

0+阅读 · 2021年11月13日

Epistasis Detection Via the Joint Cumulant

Epistasis Detection Via the Joint Cumulant

Arxiv

0+阅读 · 2021年11月12日

A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification

Arxiv

6+阅读 · 2021年4月1日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

18+阅读 · 2019年10月30日

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Arxiv

5+阅读 · 2019年3月14日

To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

Arxiv

4+阅读 · 2018年8月24日

Automatic multi-objective based feature selection for classification

Automatic multi-objective based feature selection for classification

Arxiv

6+阅读 · 2018年7月9日

An Improved Evaluation Framework for Generative Adversarial Networks

Arxiv

3+阅读 · 2018年3月27日

Active Learning from Positive and Unlabeled Data

Arxiv

3+阅读 · 2016年2月24日

VIP会员

文章信息

相关主题

最新内容

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

专知会员服务

3+阅读 · 7月17日

《边缘端实时无线感知赋能现场多机器人部署》200页

《边缘端实时无线感知赋能现场多机器人部署》200页

专知会员服务

3+阅读 · 7月17日

战力倍增器：自主武器系统与乌克兰及加沙冲突

战力倍增器：自主武器系统与乌克兰及加沙冲突

专知会员服务

2+阅读 · 7月17日

人工智能赋能战场情报：提速决策进程

人工智能赋能战场情报：提速决策进程

专知会员服务

1+阅读 · 7月17日

《拥抱新兴技术：面向未来军官的教育革新》

《拥抱新兴技术：面向未来军官的教育革新》

专知会员服务

4+阅读 · 7月17日

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

专知会员服务

1+阅读 · 7月17日

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

专知会员服务

1+阅读 · 7月17日

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

专知会员服务

11+阅读 · 7月16日

《无人地面战车（UGV）的崛起》报告

《无人地面战车（UGV）的崛起》报告

专知会员服务

7+阅读 · 7月16日

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

专知会员服务

6+阅读 · 7月16日

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

专知会员服务

12+阅读 · 7月16日

美陆军任务式指挥人工智能解决方案

美陆军任务式指挥人工智能解决方案

专知会员服务

12+阅读 · 7月16日

ICML 2026 | 理论级自动形式化：从孤立命题到统一形式化知识库

ICML 2026 | 理论级自动形式化：从孤立命题到统一形式化知识库

专知会员服务

9+阅读 · 7月16日

综述 | 现代智能体自我改进，从模型更新到脚手架演化

综述 | 现代智能体自我改进，从模型更新到脚手架演化

专知会员服务

15+阅读 · 7月16日

美国陆军宣布“项目融合-顶点6”：现代化进程的关键里程碑

美国陆军宣布“项目融合-顶点6”：现代化进程的关键里程碑

专知会员服务

13+阅读 · 7月15日

相关VIP内容

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

128+阅读 · 2020年11月20日

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

【SIGIR2020】高效查询自动补全，Efficient and Effective Query Auto-Completion

专知会员服务

10+阅读 · 2020年5月14日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

116+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书】数据科学编程傻瓜式教程，数据科学编程一体机 | Data Science Programming All-In-One For Dummies

【新书】数据科学编程傻瓜式教程，数据科学编程一体机 | Data Science Programming All-In-One For Dummies

专知会员服务

41+阅读 · 2020年1月22日

【新书】Python数据科学食谱（Python Data Science Cookbook）

【新书】Python数据科学食谱（Python Data Science Cookbook）

专知会员服务

118+阅读 · 2020年1月1日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《边缘端实时无线感知赋能现场多机器人部署》200页

人工智能赋能战场情报：提速决策进程

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

战力倍增器：自主武器系统与乌克兰及加沙冲突

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

已删除

将门创投

5+阅读 · 2018年6月7日

相关论文

Measuring Uncertainty in Translation Quality Evaluation (TQE)

Arxiv

0+阅读 · 2021年11月15日

New Performance Measures for Object Tracking under Complex Environments

Arxiv

0+阅读 · 2021年11月13日

Epistasis Detection Via the Joint Cumulant

Epistasis Detection Via the Joint Cumulant

Arxiv

0+阅读 · 2021年11月12日

A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification

Arxiv

6+阅读 · 2021年4月1日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

18+阅读 · 2019年10月30日

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Arxiv

5+阅读 · 2019年3月14日

To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

Arxiv

4+阅读 · 2018年8月24日

Automatic multi-objective based feature selection for classification

Automatic multi-objective based feature selection for classification

Arxiv

6+阅读 · 2018年7月9日

An Improved Evaluation Framework for Generative Adversarial Networks

Arxiv

3+阅读 · 2018年3月27日

Active Learning from Positive and Unlabeled Data

Arxiv

3+阅读 · 2016年2月24日

微信扫码咨询专知VIP会员