The widespread adoption of Large Language Models (LLMs) has raised significant privacy concerns regarding the exposure of personally identifiable information (PII) in user prompts. To address this challenge, we propose a query-unrelated PII masking strategy and introduce PII-Bench, the first comprehensive evaluation framework for assessing privacy protection systems. PII-Bench comprises 2,842 test samples spanning 55 fine-grained PII categories, covering scenarios that range from single-subject descriptions to complex multi-party interactions. Each sample is carefully crafted with a user query, a context description, and a standard answer indicating the query-relevant PII. Our empirical evaluation reveals that while current models perform adequately on basic PII detection, they show significant limitations in determining PII query relevance. Even state-of-the-art LLMs struggle with this task, particularly in complex multi-subject scenarios, indicating substantial room for improvement toward intelligent PII masking.