This paper introduces Interpretability-Guided Bi-objective Optimization (IGBO), a framework that trains interpretable models by incorporating structured domain knowledge through a bi-objective formulation. IGBO encodes feature-importance hierarchies as a Directed Acyclic Graph (DAG) and measures feature importance with Temporal Integrated Gradients (TIG). The DAG is constructed with a Central Limit Theorem-based procedure, which ensures the statistical validity of edge-orientation decisions. To address the Out-of-Distribution (OOD) problem in TIG computation, we propose an Optimal Path Oracle that learns data-manifold-aware integration paths. Theoretical analysis establishes convergence properties via a geometric projection mapping $\mathcal{P}$ and proves robustness to mini-batch noise. Empirical results on time-series data demonstrate that IGBO enforces the DAG constraints with minimal accuracy loss and outperforms standard regularization baselines.
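As an illustration of what a Central Limit Theorem-based edge-orientation decision could look like, the sketch below applies a two-sided z-test to the paired difference of per-sample importance scores for each feature pair. This is not the paper's construction, which the abstract does not specify. The function name `orient_edges`, the significance level `alpha`, the convention that an edge `(i, j)` means "feature `i` is more important than feature `j`", and the synthetic scores are all assumptions made for this example.

```python
import numpy as np
from statistics import NormalDist


def orient_edges(importance, alpha=0.05):
    """Hypothetical helper: orient edges between feature pairs with a z-test.

    importance: array of shape (n_samples, n_features) holding per-sample
    importance magnitudes (for example, absolute attribution scores).
    Returns a list of (i, j) pairs, read here as "i is more important than j".
    """
    n, d = importance.shape
    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    edges = []
    for i in range(d):
        for j in range(i + 1, d):
            diff = importance[:, i] - importance[:, j]
            # By the CLT, the mean of the paired differences is approximately
            # normal for large n, with this estimated standard error.
            se = diff.std(ddof=1) / np.sqrt(n)
            if se == 0.0:
                continue  # degenerate case: this sketch makes no decision
            z = diff.mean() / se
            if z > z_crit:
                edges.append((i, j))
            elif z < -z_crit:
                edges.append((j, i))
            # Otherwise the difference is not significant: leave the pair unordered.
    return edges


# Synthetic scores (assumed data): feature 0 is clearly more important than 1 and 2.
rng = np.random.default_rng(0)
scores = np.abs(rng.normal(loc=[3.0, 2.0, 2.0], scale=0.5, size=(500, 3)))
edges = orient_edges(scores)
print(edges)
```

In this sketch every accepted edge points from a feature with a strictly larger sample mean to one with a smaller sample mean, so the resulting graph cannot contain a cycle. The example applies no multiple-testing correction, which a careful construction would likely need when the number of feature pairs is large.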