Privacy-Preserving Explainable AIoT Application via SHAP Entropy Regularization

The widespread integration of Artificial Intelligence of Things (AIoT) in smart home environments has amplified the demand for transparent and interpretable machine learning models. To foster user trust and comply with emerging regulatory frameworks, Explainable AI (XAI) methods, particularly post-hoc techniques such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), are widely employed to elucidate model behavior. However, recent studies have shown that these explanation methods can inadvertently expose sensitive user attributes and behavioral patterns, thereby introducing new privacy risks. To address these concerns, we propose a novel privacy-preserving approach based on SHAP entropy regularization to mitigate privacy leakage in explainable AIoT applications. Our method adds an entropy-based regularization term to the training objective that penalizes low-entropy SHAP attribution distributions, promoting a more uniform spread of feature contributions. To evaluate the effectiveness of our approach, we developed a suite of SHAP-based privacy attacks that use model explanation outputs to infer sensitive information. We validate our method by comparing it against baseline models under these attacks and on utility metrics, using benchmark smart home energy consumption datasets. Experimental results demonstrate that SHAP entropy regularization substantially reduces privacy leakage compared to baseline models while maintaining high predictive accuracy and explanation fidelity. This work contributes to the development of privacy-preserving explainable AI techniques for secure and trustworthy AIoT applications.
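The entropy-based objective described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: attributions are normalized by their absolute values into a probability distribution, Shannon entropy is measured in nats, and the hypothetical weight `lam` trades task loss against entropy. The function names `attribution_entropy` and `regularized_loss` are likewise hypothetical. The sketch only evaluates the objective on given attribution values; in actual training the attributions would need to be computed in a differentiable way.

```python
import numpy as np


def attribution_entropy(phi, eps=1e-12):
    """Shannon entropy (in nats) of normalized absolute attributions.

    phi: array of shape (n_samples, n_features) holding SHAP-style
    attribution values. Returns one entropy value per sample.
    Assumption: normalizing |phi| row-wise is one plausible way to turn
    attributions into a distribution; the paper may use another.
    """
    phi = np.atleast_2d(np.asarray(phi, dtype=float))
    mag = np.abs(phi) + eps                       # avoid log(0)
    p = mag / mag.sum(axis=1, keepdims=True)      # each row sums to 1
    return -(p * np.log(p)).sum(axis=1)


def regularized_loss(task_loss, phi, lam=0.1):
    """Task loss minus lam times the mean attribution entropy.

    Subtracting the entropy means low-entropy (concentrated)
    attributions yield a higher loss, i.e. they are penalized.
    """
    return task_loss - lam * attribution_entropy(phi).mean()


# Two hypothetical attribution vectors over four features.
uniform = np.array([[1.0, 1.0, 1.0, 1.0]])        # evenly spread
concentrated = np.array([[1.0, 0.0, 0.0, 0.0]])   # one dominant feature

h_uniform = attribution_entropy(uniform)[0]
h_concentrated = attribution_entropy(concentrated)[0]
loss_uniform = regularized_loss(0.5, uniform)
loss_concentrated = regularized_loss(0.5, concentrated)
```

With the same task loss, the concentrated attribution vector receives the larger regularized loss, which matches the stated goal of promoting a more uniform spread of feature contributions.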
