While privacy-focused browsers have taken steps to block third-party cookies and browser fingerprinting, novel tracking methods that bypass existing defenses continue to emerge. Because trackers must exfiltrate information from the client side to the server side through link decoration regardless of the tracking technique they employ, a promising orthogonal approach is to detect and sanitize tracking information in decorated links. We present PURL, a machine-learning approach that leverages a cross-layer graph representation of webpage execution to safely and effectively sanitize link decoration. Our evaluation shows that PURL significantly outperforms existing countermeasures in accuracy and in reducing website breakage, while remaining robust to common evasion techniques. We use PURL to perform a measurement study on the top million websites. We find that well-known advertisers and trackers widely abuse link decoration to exfiltrate user information collected from browser storage, email addresses, and scripts involved in fingerprinting.
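To make the idea of sanitizing a decorated link concrete, the following is a minimal sketch and not PURL's implementation. It assumes that link decoration is carried in query parameters only, and it stands in for PURL's learned classifier with a hypothetical predicate `is_tracking` backed by a small illustrative blocklist (`SUSPECTED_TRACKING_PARAMS`). The function name `sanitize_link` is likewise an assumption introduced here for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative blocklist of click-identifier parameter names (an assumption
# for this sketch). PURL replaces a static list like this with a classifier.
SUSPECTED_TRACKING_PARAMS = {"gclid", "fbclid", "msclkid"}


def default_is_tracking(name, value):
    """Hypothetical stand-in for a learned per-decoration classifier."""
    return name.lower() in SUSPECTED_TRACKING_PARAMS


def sanitize_link(url, is_tracking=default_is_tracking):
    """Return `url` with query parameters flagged by `is_tracking` removed."""
    parts = urlsplit(url)
    # Keep blank values so functional parameters such as "?debug=" survive.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if not is_tracking(name, value)]
    # Rebuild the URL, leaving scheme, host, path, and fragment untouched.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(sanitize_link("https://example.com/page?id=42&fbclid=abc123#top"))
```

Removing only the flagged parameters, rather than the whole query string, reflects the abstract's concern with website breakage: parameters the page needs in order to function (here, `id`) are left in place.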