As the adoption of explainable AI (XAI) continues to expand, the urgency to address its privacy implications intensifies. Despite a growing corpus of research in AI privacy and explainability, little attention has been paid to privacy-preserving model explanations. This article presents the first thorough survey of privacy attacks on model explanations and their countermeasures. Our contribution to this field comprises an in-depth analysis of research papers, together with a connected taxonomy that categorises privacy attacks and countermeasures according to the explanations they target. This work also includes an initial investigation into the causes of privacy leaks. Finally, we discuss unresolved issues and prospective research directions uncovered in our analysis. This survey aims to be a valuable resource for the research community and offers clear insights for those new to this domain. To support ongoing research, we have established an online resource repository, which will be continuously updated with new and relevant findings. Interested readers are encouraged to access our repository at https://github.com/tamlhp/awesome-privex.