The ability to interpret machine learning model decisions is critical in domains such as healthcare, where trust in model predictions is as important as their accuracy. Inspired by the development of prototype-parts-based deep neural networks in computer vision, we propose a new model for tabular data, tailored specifically to medical records, which requires discretizing diagnostic results against their reference norms. Unlike the original vision models, which rely on spatial structure, our method applies trainable patching over the features describing a patient to learn meaningful prototypical parts from structured data. These parts are represented as binary or discretized feature subsets, which allows the model to express prototypes in human-readable terms and aligns them with clinical language and case-based reasoning. The proposed neural network is inherently interpretable: it produces concept-based predictions by comparing a patient's description to learned prototypes in the latent space of the network. In experiments on medical benchmark datasets, we demonstrate that the model achieves classification performance competitive with widely used baselines while remaining transparent, bridging the gap between predictive performance and interpretability in clinical decision support.
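The core prediction mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the dimensions, the linear encoder, and the randomly initialized parameters are all hypothetical stand-ins, and the distance-to-similarity transform follows the log-activation commonly used in ProtoPNet-style models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 8 discretized patient
# features, a 4-dimensional latent space, 3 prototypes, 2 classes.
n_features, latent_dim, n_prototypes, n_classes = 8, 4, 3, 2

# Stand-ins for trained parameters: a linear encoder, prototype vectors
# living in the latent space, and a classification head over similarities.
W_enc = rng.standard_normal((n_features, latent_dim))
prototypes = rng.standard_normal((n_prototypes, latent_dim))
W_cls = rng.standard_normal((n_prototypes, n_classes))

def predict(x):
    """Classify one patient record (binary/discretized feature vector)."""
    z = x @ W_enc                                # encode into the latent space
    d = np.linalg.norm(prototypes - z, axis=1)   # distance to each prototype
    s = np.log((d**2 + 1.0) / (d**2 + 1e-4))     # distance -> similarity score
    logits = s @ W_cls                           # linear head over similarities
    return logits, s

x = rng.integers(0, 2, size=n_features).astype(float)  # one discretized record
logits, sims = predict(x)
```

The per-prototype similarities `sims` are what make the prediction inspectable: each score says how closely the patient's latent representation matches one learned prototypical part, so a clinician can trace a decision back to the feature subsets those prototypes encode.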