Inverse classification with logistic and softmax classifiers: efficient optimization

In recent years, a certain type of problem has become of interest where one wants to query a trained classifier. Specifically, one wants to find the closest instance to a given input instance such that the classifier's predicted label is changed in a desired way. Examples of these "inverse classification" problems are counterfactual explanations, adversarial examples and model inversion. All of them are fundamentally optimization problems over the input instance vector involving a fixed classifier, and it is of interest to achieve a fast solution for interactive or real-time applications. We focus on solving this problem efficiently with the squared Euclidean distance for two of the most widely used classifiers: logistic regression and the softmax classifier. Owing to special properties of these models, we show that the optimization can be solved in closed form for logistic regression, and iteratively but extremely fast for the softmax classifier. This allows us to solve either case exactly (to nearly machine precision) in a runtime of milliseconds to around a second, even for very high-dimensional instances and many classes.
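To make the closed-form claim for logistic regression concrete, here is a minimal sketch under one simple formulation, which is an assumption and not necessarily the exact problem the paper solves: find the instance closest to `x0` in squared Euclidean distance whose predicted probability equals a chosen target. Because the logistic model's probability depends on the input only through the linear score `w @ x + b`, that constraint is a hyperplane, and the solution is the orthogonal projection of `x0` onto it. The function name `closest_instance_logistic`, the weights `w`, bias `b` and the target value are all hypothetical, chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def closest_instance_logistic(x0, w, b, target_prob):
    """Closest point to x0 (Euclidean) with sigmoid(w @ x + b) == target_prob.

    Assumed formulation: an equality constraint on the predicted probability.
    The constraint set is the hyperplane w @ x + b = logit(target_prob), so
    the minimizer is the orthogonal projection of x0 onto that hyperplane.
    """
    target_logit = np.log(target_prob / (1.0 - target_prob))
    current_logit = w @ x0 + b
    # Signed step along w that moves the score exactly to the target logit.
    step = (target_logit - current_logit) / (w @ w)
    return x0 + step * w


# Hypothetical classifier and input, for illustration only.
rng = np.random.default_rng(0)
w = rng.normal(size=5)
b = 0.3
x0 = rng.normal(size=5)

x_star = closest_instance_logistic(x0, w, b, target_prob=0.9)
prob_after = sigmoid(w @ x_star + b)
displacement = x_star - x0
```

The cost is a couple of dot products, so it scales linearly with the input dimension. The softmax case has no such single-hyperplane structure in general, which is consistent with the abstract's statement that it requires an iterative solver.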

