Deep neural networks can achieve high accuracy while relying on evidence that is hard to inspect or misaligned with the intended task. Concept Bottleneck Models (CBMs) expose human-interpretable concepts, but most treat concepts as global attributes and do not show how localized evidence is aggregated into a decision. We propose \textbf{SEG-MIL-CBM}, a spatially grounded CBM that decomposes each image into concept-guided regions and classifies it by attention-based aggregation of segment-level concept evidence. The same segment evidence terms form the prediction and the explanation, exposing which regions and concepts support the predicted logit without a separate post-hoc attribution module. Among evaluated CBM-family baselines, SEG-MIL-CBM improves Waterbirds worst-group accuracy from $65.1\%$ to $72.0\%$, reaches $87.4\%$ worst-group accuracy on Pawrious, remains competitive on standard recognition, and attains the best CBM accuracy on CIFAR-100 ($85.3\%$). Segment-level faithfulness experiments on CUB further show that its learned segment ranking matches or improves over evaluated segment-ranking controls.
翻译:暂无翻译