Red Pill and Blue Pill: Controllable Website Fingerprinting Defense via Dynamic Backdoor Learning

Website fingerprinting (WF) attacks, which covertly monitor user communications to identify the web pages a user visits, pose a serious threat to user privacy. Existing WF defenses try to reduce the attacker's accuracy by disrupting distinctive traffic patterns, but they face a trade-off between overhead and effectiveness that limits their practical usefulness. To overcome this limitation, we introduce Controllable Website Fingerprint Defense (CWFD), a novel defense perspective based on backdoor learning. CWFD exploits backdoor vulnerabilities in neural networks to directly control the attacker's model through trigger patterns designed from network traffic. Specifically, CWFD injects only incoming packets, on the server side, into the target web page's traffic, which keeps overhead low while effectively poisoning the attacker's model during training. During inference, the defender can influence the attacker's model through a 'red pill, blue pill' choice: traces carrying the trigger (red pill) are misclassified as the target web page, while normal traces (blue pill) are classified correctly, giving the defender directed control over the defense outcome. We use the Fast Levenshtein-like distance as the optimization objective to compute trigger patterns that can be effectively associated with the target page. Experiments show that CWFD reduces RF's accuracy from 99% to 6% with 74% data overhead. In comparison, FRONT lowers accuracy only to 97% at similar overhead, while Palette reaches 32% accuracy with 48% more overhead. We further validate the practicality of our method in a real Tor network environment.
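To make the trigger idea concrete, the following is a minimal Python sketch of server-side injection on a direction-only trace. It is an illustration under stated assumptions, not the CWFD implementation: the trace encoding (+1 outgoing, -1 incoming), the `(position, count)` trigger format, and the names `inject_trigger`, `data_overhead`, and `levenshtein` are all hypothetical. The distance function is the classic Levenshtein edit distance, used here only as a stand-in for the Fast Levenshtein-like distance named above, and the overhead measure assumes equal-sized packets.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed encoding: +1 = client-to-server packet, -1 = server-to-client packet.
OUT, IN = 1, -1


def inject_trigger(trace: Sequence[int], trigger: Sequence[Tuple[int, int]]) -> List[int]:
    """Insert `count` incoming dummy packets before original index `pos`.

    Only incoming packets are added, so the original packets keep their order.
    A `pos` equal to len(trace) appends the dummy packets at the end.
    """
    inserts: Dict[int, int] = {}
    for pos, count in trigger:
        if not 0 <= pos <= len(trace) or count < 0:
            raise ValueError(f"invalid trigger entry: {(pos, count)}")
        inserts[pos] = inserts.get(pos, 0) + count

    defended: List[int] = []
    for i, pkt in enumerate(trace):
        defended.extend([IN] * inserts.get(i, 0))
        defended.append(pkt)
    defended.extend([IN] * inserts.get(len(trace), 0))
    return defended


def data_overhead(original: Sequence[int], defended: Sequence[int]) -> float:
    """Fraction of extra packets relative to the original trace (equal-sized packets assumed)."""
    return (len(defended) - len(original)) / len(original)


def levenshtein(a: Sequence[int], b: Sequence[int]) -> int:
    """Classic edit distance via row-by-row dynamic programming (stand-in metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


# Toy example: 8-packet trace, hypothetical trigger adding 3 incoming packets.
trace = [OUT, IN, IN, OUT, IN, IN, IN, OUT]
trigger = [(1, 2), (4, 1)]
red_pill = inject_trigger(trace, trigger)   # trace carrying the trigger
blue_pill = list(trace)                     # unmodified trace

print(data_overhead(trace, red_pill))       # 0.375
print(levenshtein(trace, red_pill))         # 3
```

In this toy setting the "red pill" trace differs from the "blue pill" trace only by inserted incoming packets, so the edit distance equals the number of injected packets. How the actual trigger positions and counts are optimized is not specified in the abstract and is not modeled here.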
