In-context learning (ICL) enables multimodal large language models (MLLMs) to classify images from a few labelled examples. Yet, how these models use the provided context remains opaque. While Chain-of-Thought prompting is widely used, recent work argues that it may not reflect true internal computation. In this paper, we systematically evaluate the concept-based explainability of frozen MLLMs under few-shot ICL using five conditions of increasing formal rigour, ranging from baseline classification to Description Logics (DL) axiom generation. Evaluating four state-of-the-art MLLMs via an independent LLM-as-a-judge pipeline, we demonstrate that explaining is genuinely harder than predicting alone. Surprisingly, forcing models to generate formally structured, concept-based explanations degrades predictive accuracy monotonically (from 93.8% to 90.1%), contradicting the assumption that explicit reasoning universally aids performance. However, when models successfully articulate class-discriminative visual features, explanation quality strongly correlates with correct predictions. Our findings suggest that while MLLMs excel at visual classification, they lack the specific instruction-tuning required for formal, machine-verifiable explainability.