Understanding how a machine learning or deep learning model reaches its decisions is crucial, particularly in security-sensitive applications. In this study, we introduce a neural network framework that combines the global and exact interpretability of rule-based models with the high performance of deep neural networks. Our framework, called $\textit{Truth Table rules}$ (TT-rules), builds on $\textit{Truth Table nets}$ (TTnets), a family of deep neural networks originally developed for formal verification. TT-rules extracts the set of necessary and sufficient rules $\mathcal{R}$ from a trained TTnet (global interpretability); because $\mathcal{R}$ yields the same output as the TTnet (exact interpretability), the neural network is effectively transformed into a rule-based model. This rule-based model supports binary classification, multi-label classification, and regression tasks on tabular datasets. Furthermore, the TT-rules framework optimizes the rule set $\mathcal{R}$ into $\mathcal{R}_{opt}$ by reducing both the number and the size of the rules. To aid interpretation, we visualize the rules as Reduced Ordered Binary Decision Diagrams (ROBDDs). After outlining the framework, we evaluate TT-rules on seven tabular datasets from the finance, healthcare, and justice domains and compare it with state-of-the-art rule-based methods. Our results show that TT-rules matches or exceeds the performance of other interpretable methods while maintaining a balance between performance and complexity. Notably, TT-rules is the first accurate rule-based model capable of fitting large tabular datasets, including two real-life DNA datasets with over 20K features. Finally, we investigate in depth a rule-based model derived from TT-rules on the Adult dataset.
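To make the idea of exact rule extraction concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small Boolean block with few inputs (the hypothetical `toy_block` below stands in for one learned block of a TTnet), enumerates its full truth table, and reads off a disjunctive normal form (DNF) rule that reproduces the block's output on every input. The function names and the toy weights are illustrative assumptions; the optimization of $\mathcal{R}$ into $\mathcal{R}_{opt}$ and the ROBDD visualization are not shown.

```python
from itertools import product


def toy_block(x0, x1, x2):
    """Hypothetical stand-in for a learned block: a thresholded linear unit."""
    return int(2 * x0 - x1 + x2 >= 2)


def extract_dnf(fn, n_inputs):
    """Enumerate all 2**n_inputs binary inputs and keep those mapped to 1.

    Each kept input is one conjunction (minterm) of the DNF rule.
    """
    return [bits for bits in product((0, 1), repeat=n_inputs) if fn(*bits)]


def evaluate_dnf(minterms, bits):
    """The DNF rule fires exactly when the input equals one of its minterms."""
    return int(tuple(bits) in set(minterms))


def format_dnf(minterms, names):
    """Render the rule as a readable Boolean formula."""
    clauses = []
    for bits in minterms:
        literals = [name if bit else f"~{name}" for name, bit in zip(names, bits)]
        clauses.append("(" + " & ".join(literals) + ")")
    return " | ".join(clauses) if clauses else "False"


rules = extract_dnf(toy_block, 3)
formula = format_dnf(rules, ["x0", "x1", "x2"])

# Exactness check: the extracted rule agrees with the block on every input.
is_exact = all(
    evaluate_dnf(rules, bits) == toy_block(*bits)
    for bits in product((0, 1), repeat=3)
)
```

Exhaustive enumeration costs $2^n$ evaluations for a block with $n$ inputs, so this sketch only makes sense when each block reads a small number of inputs. The unsimplified DNF it produces is also larger than necessary, which is the kind of redundancy a rule-reduction step would target.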