Attribute Inference from Interactive Targeted Ads

Targeted advertising systems can pair audiences selected by advertisers with ad units that expose visible user actions. When an interaction remains linked to the campaign that elicited it, the advertiser may receive an observation tied to a user rather than only an aggregate report. We model that channel as a noisy oracle for attribute inference. The model separates targeting predicates, exposure, interaction, and disclosure. These boundaries capture the gap between eligibility and delivery, and the gap between interaction and advertiser visibility. We build a reproducible benchmark using synthetic populations calibrated with public data, each with known sensitive labels. A generated campaign semantics layer provides topic variants and response priors. The simulator generates the ground truth, event traces, disclosed observations, and metrics. The evaluation compares Bayesian, supervised, positive and unlabeled, and adaptive attacks under common campaign and disclosure definitions. The final evaluation uses four topic variants, seven simulator seeds, and two interaction settings. Repeated campaigns with identity exposure produce measurable but bounded inference signal. At $160$ campaigns, Bayesian and supervised attacks reach about $0.64$ AUC in the main setting and about $0.65$ AUC in the higher interaction setting. Disclosure policy is the strongest control. Aggregate reporting removes the evaluated oracle input tied to users. Type filtering and randomized disclosure reduce the released signal. The result is a model, artifact, and defense evaluation method for privacy in interactive targeted advertising. The code is available at https://github.com/P-HOW/Interactive-Ad-Oracle.
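
As a rough illustration of the noisy-oracle view, the sketch below chains exposure, interaction, and disclosure into a single per-campaign observation probability and applies a log-odds Bayesian update over repeated campaigns. It is a minimal sketch under stated assumptions, not the released artifact: the function names, the independence of the three stages, the binary sensitive attribute, and every probability value are hypothetical choices made for illustration.

```python
import math
import random


def observe_prob(p_expose, p_interact, p_disclose):
    """Chance that one campaign yields a user-linked observation.

    Assumes (hypothetically) that exposure, interaction, and disclosure
    are independent stages applied in that order to an eligible user.
    """
    return p_expose * p_interact * p_disclose


def simulate_observations(sensitive, n_campaigns, p_expose,
                          p_interact_pos, p_interact_neg, p_disclose, rng):
    """Simulate what the advertiser sees for one eligible user."""
    p_interact = p_interact_pos if sensitive else p_interact_neg
    observations = []
    for _ in range(n_campaigns):
        exposed = rng.random() < p_expose
        interacted = exposed and rng.random() < p_interact
        disclosed = interacted and rng.random() < p_disclose
        observations.append(disclosed)
    return observations


def posterior(observations, prior, p_obs_pos, p_obs_neg):
    """Posterior P(sensitive = 1 | observations) via a log-odds update.

    A missing observation is ambiguous to the attacker (no exposure, no
    interaction, or no disclosure), so it contributes the likelihood
    ratio (1 - p_obs_pos) / (1 - p_obs_neg).
    """
    log_odds = math.log(prior / (1.0 - prior))
    for seen in observations:
        if seen:
            log_odds += math.log(p_obs_pos / p_obs_neg)
        else:
            log_odds += math.log((1.0 - p_obs_pos) / (1.0 - p_obs_neg))
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    rng = random.Random(0)  # illustrative seed, not one of the paper's seeds
    p_e, p_pos, p_neg, p_d = 0.5, 0.20, 0.10, 1.0  # hypothetical values
    obs = simulate_observations(True, 160, p_e, p_pos, p_neg, p_d, rng)
    print(posterior(obs, 0.3,
                    observe_prob(p_e, p_pos, p_d),
                    observe_prob(p_e, p_neg, p_d)))
```

Setting `p_disclose` to zero in this toy model stands in for aggregate-only reporting: no user-linked observation is ever released, both observation probabilities collapse to zero, and the posterior stays at the prior.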
