Prototypical self-explainable classifiers have emerged to meet the growing demand for interpretable AI systems. These classifiers are designed to make their decisions highly transparent by basing inference on similarity to learned prototypical objects. Although these models are designed with diversity in mind, the learned prototypes often fail to represent all aspects of the input distribution, particularly those in low-density regions. This lack of sufficient data representation, known as representation bias, has been associated with various detrimental properties related to diversity and fairness in machine learning. In light of this, we introduce pantypes, a new family of prototypical objects designed to capture the full diversity of the input distribution through a sparse set of objects. We show that pantypes can empower prototypical self-explainable models by occupying divergent regions of the latent space, thereby fostering high diversity, interpretability, and fairness.