In real applications, interaction between machine learning models and domain experts is critical; however, the classical machine learning paradigm, which usually produces only a single model, does not facilitate such interaction. Approximating and exploring the Rashomon set, i.e., the set of all near-optimal models, addresses this practical challenge by providing a searchable space containing a diverse set of models from which domain experts can choose. We present a technique to efficiently and accurately approximate the Rashomon set of sparse generalized additive models (GAMs). Specifically, we present algorithms that approximate the Rashomon set of GAMs with ellipsoids for fixed support sets, and we use these ellipsoids to approximate Rashomon sets for many different support sets. The approximated Rashomon set serves as a cornerstone for solving practical challenges such as (1) studying variable importance for the model class; (2) finding models under user-specified constraints (monotonicity, direct editing); and (3) investigating sudden changes in the shape functions. Experiments demonstrate the fidelity of the approximated Rashomon set and its effectiveness in solving these practical challenges.
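To make the idea of an ellipsoidal approximation for a fixed support set concrete, the following is a minimal sketch and not the algorithm of this work. It assumes a logistic loss with an L2 penalty over binary (binned) features, defines the Rashomon set as all coefficient vectors whose loss is at most (1 + eps) times the optimum, and builds the ellipsoid from a second-order Taylor expansion around the optimum. That expansion is a simple stand-in technique chosen for illustration. All function names, the threshold definition, and the parameters `lam` and `eps` are hypothetical, and the search over different support sets is not covered.

```python
import numpy as np

def loss(w, X, y, lam):
    """Mean logistic loss plus an L2 penalty (labels y in {0, 1})."""
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z) + lam * (w @ w)

def grad_hess(w, X, y, lam):
    """Gradient and Hessian of `loss` at w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    g = X.T @ (p - y) / len(y) + 2.0 * lam * w
    H = (X.T * (p * (1.0 - p))) @ X / len(y) + 2.0 * lam * np.eye(len(w))
    return g, H

def fit_newton(X, y, lam, steps=50):
    """Minimize the (strictly convex) loss with Newton steps and step halving."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        g, H = grad_hess(w, X, y, lam)
        step = np.linalg.solve(H, g)
        t = 1.0
        while loss(w - t * step, X, y, lam) > loss(w, X, y, lam) and t > 1e-8:
            t *= 0.5
        w = w - t * step
    return w

def taylor_ellipsoid(X, y, lam, eps):
    """Return (w_star, Q, theta) with ellipsoid {w : (w - w_star)^T Q (w - w_star) <= 1}.

    Assumption: the Rashomon threshold is theta = (1 + eps) * optimal loss.
    The ellipsoid is the sublevel set of the quadratic Taylor model at w_star.
    """
    w_star = fit_newton(X, y, lam)
    _, H = grad_hess(w_star, X, y, lam)
    best = loss(w_star, X, y, lam)
    theta = (1.0 + eps) * best
    Q = H / (2.0 * (theta - best))  # 0.5 * d^T H d <= theta - best
    return w_star, Q, theta

def sample_ellipsoid(w_star, Q, n, rng):
    """Draw n points uniformly from the ellipsoid defined by (w_star, Q)."""
    d = len(w_star)
    u = rng.normal(size=(n, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)   # uniform directions
    r = rng.random(n) ** (1.0 / d)                  # radii for a uniform ball
    L = np.linalg.cholesky(Q)                       # Q = L L^T
    pts = np.linalg.solve(L.T, (u * r[:, None]).T).T
    return w_star + pts

def fraction_in_rashomon(W, X, y, lam, theta):
    """Share of sampled models whose true loss is within the threshold."""
    return float(np.mean([loss(w, X, y, lam) <= theta for w in W]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(400, 5)).astype(float)  # synthetic binary features
    w_true = np.array([1.0, -1.0, 0.5, 0.0, -0.5])
    y = (rng.random(400) < 1.0 / (1.0 + np.exp(-(X @ w_true)))).astype(float)
    w_star, Q, theta = taylor_ellipsoid(X, y, lam=0.01, eps=0.05)
    W = sample_ellipsoid(w_star, Q, 1000, rng)
    print("fraction of sampled models inside the Rashomon set:",
          fraction_in_rashomon(W, X, y, 0.01, theta))
```

The fraction printed at the end is one rough way to check how faithful such an ellipsoid is: sampled models that fall outside the true threshold indicate where the quadratic model is too optimistic. The sketch only checks that direction, and says nothing about Rashomon-set models that the ellipsoid misses.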