To what extent are users surveilled on the web, by what technologies, and by whom? We answer these questions by combining passively observed, anonymized browsing data from a large, representative sample of Americans with domain-level tracking data from Blacklight. We find that nearly all users ($>99\%$) encounter at least one ad tracker or third-party cookie over the observation window. More invasive techniques, such as session recording, keylogging, and canvas fingerprinting, are less widespread, but more than half of users visited a site employing at least one of them within 48 hours of the start of tracking. Linking trackers to their parent organizations reveals that a single organization, usually Google, can track over $50\%$ of the web activity of more than half of users. Demographic differences in exposure are modest and often attenuate once we account for browsing volume. However, disparities by age and race remain, suggesting that what users browse, not just how much, shapes their surveillance risk.
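The organization-level measure described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the visit log, the domain-to-organization mapping, and the function names `org_share_by_user` and `fraction_of_users_above` are all hypothetical, and the sketch assumes that each visit is one unit of web activity and that an organization can observe a visit whenever one of its trackers is present on the visited domain.

```python
from collections import defaultdict

# Hypothetical toy visit log: (user_id, visited_domain) pairs.
visits = [
    ("u1", "news.example"),
    ("u1", "shop.example"),
    ("u1", "blog.example"),
    ("u1", "news.example"),
    ("u2", "shop.example"),
    ("u2", "blog.example"),
]

# Hypothetical domain-level tracking data: the parent organizations
# whose trackers were detected on each domain.
trackers_by_domain = {
    "news.example": {"OrgA", "OrgB"},
    "shop.example": {"OrgA"},
    "blog.example": set(),
}


def org_share_by_user(visits, trackers_by_domain):
    """Return, per user, the share of visits each organization could observe."""
    totals = defaultdict(int)                      # visits per user
    seen = defaultdict(lambda: defaultdict(int))   # visits per user per org
    for user, domain in visits:
        totals[user] += 1
        # Domains missing from the tracking data count as untracked.
        for org in trackers_by_domain.get(domain, set()):
            seen[user][org] += 1
    return {
        user: {org: n / totals[user] for org, n in seen[user].items()}
        for user in totals
    }


def fraction_of_users_above(shares, org, threshold=0.5):
    """Return the fraction of users for whom `org` observes more than `threshold` of visits."""
    if not shares:
        return 0.0
    hits = sum(1 for per_org in shares.values() if per_org.get(org, 0.0) > threshold)
    return hits / len(shares)


shares = org_share_by_user(visits, trackers_by_domain)
half_tracked = fraction_of_users_above(shares, "OrgA", threshold=0.5)
```

Counting visits is only one possible definition of web activity; time spent on a site or the number of distinct domains visited would give different shares, and the abstract does not say which unit the analysis uses.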