Machine learning is central to modern science, industry, and policy, yet its predictive power often comes at the cost of transparency: we rarely know which input features truly drive a model's predictions. Without such understanding, researchers cannot draw reliable conclusions, practitioners cannot ensure fairness or accountability, and policymakers cannot trust or govern model-based decisions. Existing tools for assessing feature influence are limited; most lack statistical guarantees, and many require costly retraining or surrogate modeling, making them impractical for large modern models. We introduce AICO, a broadly applicable framework that turns model interpretability into an efficient statistical exercise. AICO tests whether each feature genuinely improves predictive performance by masking its information and measuring the resulting change. The method provides exact, finite-sample feature p-values and confidence intervals for feature importance through a simple, non-asymptotic hypothesis testing procedure. It requires no retraining, surrogate modeling, or distributional assumptions, making it feasible for large-scale algorithms. In both controlled experiments and real applications, from credit scoring to mortgage-behavior prediction, AICO reliably identifies the variables that drive model behavior, providing a scalable and statistically principled path toward transparent and trustworthy machine learning.
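To make the masking-and-testing idea concrete, the following is a minimal sketch under stated assumptions, not the AICO procedure itself. The abstract does not specify the test statistic, the masking value, or the performance score, so the sketch assumes an exact one-sided sign test on per-sample loss differences, masking by a fixed baseline value (the feature mean), and squared-error loss. The names `masked_feature_deltas` and `sign_test_pvalue`, and the toy model `predict`, are hypothetical and introduced only for illustration.

```python
import math
import numpy as np


def masked_feature_deltas(predict, X, y, feature, baseline_value):
    """Per-sample loss increase when one feature is masked.

    Assumption: masking means replacing the column with a fixed
    baseline value, and performance is measured by squared error.
    The model is only evaluated, never retrained.
    """
    X_masked = X.copy()
    X_masked[:, feature] = baseline_value
    loss_masked = (y - predict(X_masked)) ** 2
    loss_full = (y - predict(X)) ** 2
    # Positive delta: the model does worse without the feature.
    return loss_masked - loss_full


def sign_test_pvalue(deltas):
    """Exact one-sided sign test of H0: median(delta) <= 0.

    Ties (delta == 0) are dropped, a common convention that is
    assumed here. The p-value is P(X >= k) for X ~ Binomial(n, 1/2),
    computed exactly with integer arithmetic.
    """
    deltas = np.asarray(deltas, dtype=float)
    nonzero = deltas[deltas != 0]
    n = nonzero.size
    if n == 0:
        return 1.0  # no evidence that the feature helps
    k = int(np.sum(nonzero > 0))
    tail = sum(math.comb(n, i) for i in range(k, n + 1))
    return tail / 2 ** n


# Toy setup: feature 0 drives the response, feature 1 is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=500)


def predict(X):
    # Stand-in for an already-trained model that uses only feature 0.
    return 2.0 * X[:, 0]


p_relevant = sign_test_pvalue(
    masked_feature_deltas(predict, X, y, 0, X[:, 0].mean())
)
p_irrelevant = sign_test_pvalue(
    masked_feature_deltas(predict, X, y, 1, X[:, 1].mean())
)
```

In this toy setup, `p_relevant` is very small because masking feature 0 increases the loss on most samples, while `p_irrelevant` equals 1.0 because the stand-in model ignores feature 1 and every loss difference is exactly zero. The binomial tail is computed exactly rather than by a normal approximation, which is one simple way to obtain a finite-sample p-value without distributional assumptions on the data.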