Deep neural networks have achieved remarkable performance on a variety of text-based tasks but often lack interpretability, making them less suitable for applications where transparency is critical. To address this, we propose ProtoLens, a novel prototype-based model that provides fine-grained, sub-sentence-level interpretability for text classification. ProtoLens uses a Prototype-aware Span Extraction module to identify relevant text spans associated with learned prototypes and a Prototype Alignment mechanism to ensure that prototypes remain semantically meaningful throughout training. By aligning prototype embeddings with human-understandable examples, ProtoLens delivers interpretable predictions while maintaining competitive accuracy. Extensive experiments demonstrate that ProtoLens outperforms both prototype-based and non-interpretable baselines on multiple text classification benchmarks. Code and data are available at \url{https://anonymous.4open.science/r/ProtoLens-CE0B/}.
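To make the general idea concrete, the following is a minimal sketch of generic prototype-based classification, assuming candidate span embeddings are scored against learned prototype vectors by cosine similarity and the pooled scores feed a linear class head. This is an illustrative assumption about the family of models, not the authors' exact ProtoLens architecture; all names, shapes, and pooling choices here are hypothetical.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between each row of a and each row of b."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_n @ b_n.T  # shape: (num_spans, num_prototypes)

def classify(span_embs, prototypes, class_weights):
    """Score spans against prototypes, max-pool per prototype,
    and project the pooled similarities to class logits.
    (Hypothetical pipeline for illustration only.)"""
    sims = cosine_sim(span_embs, prototypes)  # (S, P)
    pooled = sims.max(axis=0)                 # best-matching span per prototype
    return pooled @ class_weights             # (num_classes,)

rng = np.random.default_rng(0)
span_embs = rng.normal(size=(4, 8))      # 4 candidate spans, 8-dim embeddings
prototypes = rng.normal(size=(3, 8))     # 3 learned prototypes
class_weights = rng.normal(size=(3, 2))  # prototype-to-class weights, 2 classes

logits = classify(span_embs, prototypes, class_weights)
print(logits.shape)  # (2,)
```

The max-pooling step is one common way to make the prediction traceable: each prototype's contribution can be attributed back to the single span that matched it best, which is the kind of sub-sentence-level evidence the abstract describes.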