Counterfactual Explanations for Multivariate Time-Series without Training Datasets

Machine learning (ML) methods have experienced significant growth in the past decade, yet their practical application in high-impact real-world domains has been hindered by their opacity. When ML methods are responsible for making critical decisions, stakeholders often require insights into how to alter these decisions. Counterfactual explanations (CFEs) have emerged as a solution, offering interpretations of opaque ML models and providing a pathway to transition from one decision to another. However, most existing CFE methods require access to the model's training dataset, few methods can handle multivariate time-series, and none can handle multivariate time-series without training datasets. These limitations can be formidable in many scenarios. In this paper, we present CFWoT, a novel reinforcement-learning-based CFE method that generates CFEs when training datasets are unavailable. CFWoT is model-agnostic and suitable for both static and multivariate time-series datasets with continuous and discrete features. Users have the flexibility to specify non-actionable, immutable, and preferred features, as well as causal constraints which CFWoT guarantees will be respected. We demonstrate the performance of CFWoT against four baselines on several datasets and find that, despite not having access to a training dataset, CFWoT finds CFEs that make significantly fewer and significantly smaller changes to the input time-series. These properties make CFEs more actionable, as the magnitude of change required to alter an outcome is vastly reduced.
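To make the setting concrete, the sketch below searches for a counterfactual on a small multivariate time-series by querying the model alone, with no training data. It is not CFWoT: the abstract describes a reinforcement-learning method, whereas this sketch swaps in plain random hill climbing to keep the example short. The model (`predict_proba`), the helper names, the step size, and the toy input are all hypothetical, chosen only for illustration.

```python
import math
import random


def predict_proba(series):
    """Hypothetical black-box model: probability of class 1.

    `series` is a list of time steps, each a list of feature values.
    The score depends only on the mean of feature 0 (an assumption
    made so the example stays self-contained).
    """
    mean_f0 = sum(step[0] for step in series) / len(series)
    return 1.0 / (1.0 + math.exp(-10.0 * (mean_f0 - 0.5)))


def predict(series):
    """Hard label derived from the hypothetical model's score."""
    return 1 if predict_proba(series) > 0.5 else 0


def find_counterfactual(series, target, immutable=(), step_size=0.1,
                        max_iters=1000, seed=0):
    """Random hill climbing that queries only the model, never a dataset.

    Features listed in `immutable` are never modified.
    """
    rng = random.Random(seed)
    cf = [list(step) for step in series]
    actionable = [j for j in range(len(series[0])) if j not in immutable]

    def target_score(s):
        p = predict_proba(s)
        return p if target == 1 else 1.0 - p

    for _ in range(max_iters):
        if predict(cf) == target:
            return cf
        # Propose a change to one (time step, feature) cell.
        t = rng.randrange(len(cf))
        j = rng.choice(actionable)
        candidate = [list(step) for step in cf]
        candidate[t][j] += rng.choice((-step_size, step_size))
        # Keep the change only if it moves the model toward the target.
        if target_score(candidate) > target_score(cf):
            cf = candidate
    return cf if predict(cf) == target else None


def sparsity(a, b):
    """Number of cells that differ between two series."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def proximity(a, b):
    """Total absolute change (L1 distance) between two series."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Toy input: 4 time steps, 2 features; feature 1 is declared immutable.
original = [[0.4, 1.0], [0.4, 2.0], [0.4, 3.0], [0.4, 4.0]]
cf = find_counterfactual(original, target=1, immutable={1})
if cf is not None:
    print("labels:", predict(original), "->", predict(cf))
    print("cells changed:", sparsity(original, cf))
    print("total change:", proximity(original, cf))
```

The two helper metrics correspond to the properties the abstract highlights: `sparsity` counts how many values were changed, and `proximity` measures how large those changes were. Rejecting proposals that do not improve the target score is what keeps both numbers small in this toy case.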

