Scalable Explainability-as-a-Service (XaaS) for Edge AI Systems

Although Explainable AI (XAI) has advanced significantly, its integration into edge and IoT systems is typically ad hoc and inefficient. Most current methods are tightly coupled: they generate explanations at the same time as model inferences. As a result, they incur redundant computation, high latency, and poor scalability when deployed across heterogeneous fleets of edge devices. In this work, we propose Explainability-as-a-Service (XaaS), a distributed architecture that treats explainability as a first-class system service rather than a model-specific feature. The key idea is to decouple inference from explanation generation, allowing edge devices to request, cache, and verify explanations subject to resource and latency constraints. To achieve this, we introduce three main innovations: (1) a distributed explanation cache with semantic-similarity-based retrieval, which significantly reduces redundant computation; (2) a lightweight verification protocol that ensures the fidelity of both cached and newly generated explanations; and (3) an adaptive explanation engine that selects explanation methods based on device capability and user requirements. We evaluate XaaS on three real-world edge AI use cases: (i) manufacturing quality control; (ii) autonomous vehicle perception; and (iii) healthcare diagnostics. Experimental results show that XaaS reduces latency by 38% while maintaining high explanation quality across all three deployments. Overall, this work enables the deployment of transparent and accountable AI across large-scale, heterogeneous IoT systems and bridges the gap between XAI research and practical edge deployment.
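To make the decoupling idea concrete, the following is a minimal single-node sketch of a similarity-based explanation cache; it is not the XaaS implementation. All names (`ExplanationCache`, `explain`, `cosine_similarity`) and the 0.95 threshold are hypothetical, the linear scan stands in for whatever distributed index the real system uses, and explanations are represented as plain strings for simplicity. The verification protocol and the adaptive method selection are not modeled here.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class ExplanationCache:
    """Hypothetical cache keyed by input embeddings (assumption, not the paper's design)."""

    threshold: float = 0.95  # assumed similarity cutoff for reusing an explanation
    entries: List[Tuple[Tuple[float, ...], str]] = field(default_factory=list)

    def lookup(self, embedding: Vector) -> Optional[str]:
        """Return the most similar cached explanation if it clears the threshold."""
        best_score, best_explanation = -1.0, None
        for cached_embedding, explanation in self.entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_explanation = score, explanation
        return best_explanation if best_score >= self.threshold else None

    def store(self, embedding: Vector, explanation: str) -> None:
        """Add a newly generated explanation under its input embedding."""
        self.entries.append((tuple(embedding), explanation))


def explain(
    cache: ExplanationCache,
    embedding: Vector,
    generate: Callable[[Vector], str],
) -> Tuple[str, bool]:
    """Serve an explanation separately from inference.

    Returns (explanation, cache_hit). The expensive `generate` callable is
    only invoked when no sufficiently similar explanation is cached.
    """
    cached = cache.lookup(embedding)
    if cached is not None:
        return cached, True
    fresh = generate(embedding)
    cache.store(embedding, fresh)
    return fresh, False
```

In this sketch, a request whose embedding is close to a previously explained input reuses the stored explanation, which is the source of the redundant-computation savings described above. A real deployment would also need to check the fidelity of a reused explanation before returning it.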
