DeforestVis: Behavior Analysis of Machine Learning Models with Surrogate Decision Stumps

Decision Tree · ML · Behavior Analysis · Analysis · Machine Learning Models

March 31, 2023

Angelos Chatzimparmpas, Rafael M. Martins, Alexandru C. Telea, Andreas Kerren

From arXiv. This manuscript is currently under review.

As the complexity of machine learning (ML) models increases and their applications in different (and critical) domains grow, there is a strong demand for more interpretable and trustworthy ML. One straightforward and model-agnostic way to interpret complex ML models is to train surrogate models, such as rule sets and decision trees, that sufficiently approximate the original ones while being simpler and easier to explain. Yet, rule sets can become very lengthy, with many if-else statements, and decision tree depth grows rapidly when accurately emulating complex ML models. In such cases, both approaches can fail to meet their core goal of providing users with model interpretability. We tackle this by proposing DeforestVis, a visual analytics tool that offers user-friendly summarization of the behavior of complex ML models by providing surrogate decision stumps (one-level decision trees) generated with the adaptive boosting (AdaBoost) technique. Our solution helps users to explore the complexity vs. fidelity trade-off by incrementally generating more stumps, creating attribute-based explanations with weighted stumps to justify decision making, and analyzing the impact of rule overriding on training instance allocation between one or more stumps. An independent test set allows users to monitor the effectiveness of manual rule changes and form hypotheses based on case-by-case investigations. We show the applicability and usefulness of DeforestVis with two use cases and expert interviews with data analysts and model developers.
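The idea of surrogate decision stumps built with AdaBoost can be illustrated with a minimal sketch. This is not the authors' implementation: `black_box` is a hypothetical stand-in for a complex model, the grid data is synthetic, and all function names are assumptions made for illustration. Fidelity here is simply the fraction of points on which the stump ensemble agrees with the black box, measured on the same points used for fitting (the abstract describes an independent test set, which this sketch omits).

```python
import math

def black_box(x):
    # Hypothetical stand-in for a complex model; returns a label in {-1, +1}.
    return 1 if x[0] ** 2 + x[1] > 1.0 else -1

def stump_predict(stump, x):
    # A stump is a one-level tree: a single threshold test on one feature.
    feature, threshold, polarity = stump
    return polarity if x[feature] <= threshold else -polarity

def fit_stump(X, y, w):
    """Return the stump with the lowest weighted error, and that error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        values = sorted({x[feature] for x in X})
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost(X, y, n_rounds):
    """Discrete AdaBoost over stumps; returns a list of (weight, stump)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump, err = fit_stump(X, y, w)
        err = max(err, 1e-10)          # avoid division by zero
        if err >= 0.5:                 # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of points the new stump got wrong.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def fidelity(ensemble, X, y_bb):
    """Fraction of points where the surrogate agrees with the black box."""
    agree = sum(predict(ensemble, x) == yb for x, yb in zip(X, y_bb))
    return agree / len(X)

# Synthetic grid of points, labeled by the black box rather than by ground truth.
X = [(i / 10, j / 10) for i in range(11) for j in range(11)]
y_bb = [black_box(x) for x in X]
ensemble = adaboost(X, y_bb, n_rounds=20)

# Complexity vs. fidelity: evaluate growing prefixes of the stump sequence.
for k in (1, 5, 10, 20):
    print(k, "stumps -> fidelity", round(fidelity(ensemble[:k], X, y_bb), 3))
```

Because AdaBoost adds stumps one at a time, a prefix of the ensemble is itself a valid smaller surrogate, which is what makes it possible to inspect the trade-off incrementally. Each stump's weight (`alpha` in this sketch) indicates how much that single-attribute rule contributes to the combined decision.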

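Decision Tree

A decision tree is a decision-analysis method that, given the known probabilities of various outcomes, builds a tree to compute the probability that the expected net present value is greater than or equal to zero, in order to assess project risk and judge feasibility. It is a graphical method that applies probability analysis intuitively, and it takes its name from the way the drawn decision branches resemble the limbs of a tree. In machine learning, a decision tree is a predictive model that represents a mapping between object attributes and object values. Entropy measures the disorder of a system; the tree-generation algorithms ID3, C4.5, and C5.0 use entropy, a measure based on the concept of entropy in information theory. A decision tree is a tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class. Classification trees (decision trees) are a very commonly used classification method. They are a form of supervised learning: given a set of samples, each with a set of attributes and a class label determined in advance, a classifier is learned that can then assign a class to newly encountered objects.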
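As a rough illustration of how entropy can guide the choice of an attribute test, the sketch below computes Shannon entropy over class labels and the information gain of a single threshold split, the plain criterion associated with ID3. The function names and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Weighted average entropy of the non-empty child nodes.
    children = sum(len(part) / n * entropy(part)
                   for part in (left, right) if part)
    return entropy(labels) - children

# Toy data: one numeric attribute and two classes.
values = [1, 2, 3, 4]
labels = ["a", "a", "b", "b"]
print(entropy(labels))                       # evenly mixed labels
print(information_gain(values, labels, 2))   # split that separates the classes
print(information_gain(values, labels, 1))   # less informative split
```

A split with higher information gain leaves purer child nodes, so a tree learner of this kind would prefer the threshold at 2 over the threshold at 1 for this toy data.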
