Probabilistic Truly Unordered Rule Sets

Rule set learning has been revisited frequently in recent years because of its interpretability. Existing methods, however, have several shortcomings. First, most existing methods impose orders among rules, either explicitly or implicitly, which makes the models less comprehensible. Second, due to the difficulty of handling conflicts caused by overlaps (i.e., instances covered by multiple rules), existing methods often do not consider probabilistic rules. Third, learning classification rules for multi-class targets is understudied, as most existing methods focus on binary classification or handle multi-class classification via the "one-versus-rest" approach. To address these shortcomings, we propose TURS, for Truly Unordered Rule Sets. To resolve conflicts caused by overlapping rules, we propose a novel model that exploits the probabilistic properties of our rule sets, with the intuition of only allowing rules to overlap if they have similar probabilistic outputs. We next formalize the problem of learning a TURS model based on the minimum description length (MDL) principle and develop a carefully designed heuristic algorithm. We benchmark against a wide range of rule-based methods and demonstrate that our method learns rule sets that have lower model complexity and highly competitive predictive performance. In addition, we show empirically that the rules in our model are "independent" and hence truly unordered.
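
To make the idea of an unordered probabilistic rule set concrete, here is a minimal Python sketch. It is not the TURS implementation: the abstract does not say how class probabilities are computed in overlaps, so the sketch assumes one simple option, namely pooling the training instances covered by any of the rules that fire. The function names, the toy data, and the two rules are all hypothetical.

```python
from collections import Counter

def class_probs(labels, classes):
    """Empirical class distribution of a list of labels."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in classes}

def predict_proba(x, rules, X_train, y_train, classes):
    """Predict class probabilities for x with an unordered rule set.

    rules is a list of predicates (x -> bool). ASSUMPTION for this sketch:
    when several rules cover x, probabilities are estimated from the union
    of the training instances those rules cover, so no rule takes priority.
    """
    firing = [r for r in rules if r(x)]
    if firing:
        pool = [y for xi, y in zip(X_train, y_train)
                if any(r(xi) for r in firing)]
    else:
        # Default case: use training instances that no rule covers.
        pool = [y for xi, y in zip(X_train, y_train)
                if not any(r(xi) for r in rules)]
    if not pool:
        pool = list(y_train)  # fall back to the overall class distribution
    return class_probs(pool, classes)

# Hypothetical one-dimensional toy data and two overlapping rules.
X_train = [1, 2, 3, 4, 5, 6, 7, 8]
y_train = ["a", "a", "a", "b", "a", "b", "b", "b"]
classes = ["a", "b"]
rules = [lambda x: x <= 4, lambda x: 3 <= x <= 5]

for x in (1, 3, 7):
    print(x, predict_proba(x, rules, X_train, y_train, classes))
```

Because the prediction depends only on which rules fire and not on their position in the list, reversing `rules` gives the same output. In this toy example the two rules have similar class distributions on their own covers, which is the situation in which the abstract says overlap should be allowed.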

