Privacy leakage in AI-based decision processes poses significant risks, particularly when sensitive information can be inferred. We propose a formal framework for auditing privacy leakage using abductive explanations, which identify the minimal sufficient evidence justifying a model's decisions and determine whether sensitive information is disclosed. Our framework formalizes both individual-level and system-level leakage, introducing the notion of Potentially Applicable Explanations (PAE) to identify individuals whose outcomes can shield those with sensitive features. This approach provides rigorous privacy guarantees while producing human-understandable explanations, a key requirement for auditing tools. An experimental evaluation on the German Credit Dataset illustrates how the importance of the sensitive literal in the model's decision process affects privacy leakage. Despite computational challenges and simplifying assumptions, our results demonstrate that abductive reasoning enables interpretable privacy auditing, offering a practical pathway to reconciling transparency, model interpretability, and privacy preservation in AI decision-making.
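For orientation, the following is a minimal sketch of the standard notion of an abductive explanation from the formal explainability literature, on which the framework builds; the symbols $\kappa$, $\mathcal{F}$, $\mathbf{v}$, and $\mathbb{F}$ are notational assumptions here, and the paper's own formalization (including the PAE extension) may differ in detail. Given a classifier $\kappa$ over features $\mathcal{F} = \{1,\dots,m\}$ with feature space $\mathbb{F}$, and an instance $\mathbf{v} \in \mathbb{F}$, a set $\mathcal{X} \subseteq \mathcal{F}$ is a (weak) abductive explanation for the decision $\kappa(\mathbf{v})$ if
\[
\forall \mathbf{x} \in \mathbb{F}.\;\Big(\bigwedge_{i \in \mathcal{X}} x_i = v_i\Big) \rightarrow \big(\kappa(\mathbf{x}) = \kappa(\mathbf{v})\big),
\]
i.e., fixing the literals in $\mathcal{X}$ to their values in $\mathbf{v}$ is sufficient to entail the same prediction; an abductive explanation is a subset-minimal such $\mathcal{X}$. Under this reading, a decision leaks sensitive information when every such minimal sufficient set must include a sensitive literal, whereas a Potentially Applicable Explanation from a non-sensitive individual can shield it.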