With the adoption of machine learning-based solutions in routine clinical practice, the need for reliable interpretability tools has become pressing. Shapley values, which provide local explanations, have gained popularity in recent years. Here, we reveal current misconceptions about the ``true to the data'' versus ``true to the model'' trade-off and demonstrate its importance in a clinical context. We show that the interpretation of Shapley values, which strongly depends on the choice of a reference distribution for modeling feature removal, is often misunderstood. We further advocate that, for applications in medicine, the reference distribution should be tailored to the underlying clinical question. Finally, we advise on appropriate reference distributions for specific medical use cases.