Tracking pixels are used to optimize online ad campaigns through personalization, re-targeting, and conversion tracking. Past research has primarily measured the prevalence of tracking pixels on the web, with limited attention to how they are configured, even though the same pixel may be configured differently on different websites. In this paper, we present PixelConfig, a differential analysis framework for reverse-engineering the configurations of Meta Pixel deployments across the web. Using this framework, we investigate three types of Meta Pixel configurations: activity tracking (i.e., what a user is doing on a website), identity tracking (i.e., who a user is or who the device is associated with), and tracking restrictions (i.e., mechanisms to limit the sharing of potentially sensitive information). Using data from the Internet Archive's Wayback Machine, we analyze Meta Pixel configurations from 2017 to 2024 on 18K health-related websites and compare them with a control group of the top 10K websites. We find that activity tracking features, such as automatic events that collect button clicks and page metadata, and identity tracking features, such as first-party cookies that are unaffected by third-party cookie blocking, reached adoption rates of up to 98.4%, largely driven by the Pixel's default settings. We also find that, on health-related websites, the Pixel is being used to track potentially sensitive information, such as user interactions related to booking medical appointments and button clicks associated with specific medical conditions (e.g., erectile dysfunction). Tracking restriction features, such as Core Setup, are configured on up to 34.3% of health websites and 8.7% of control websites. However, even when enabled, these tracking restriction features provide limited protection and can be circumvented in practice.
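The idea of differential analysis mentioned above can be illustrated with a minimal sketch: given two configuration snapshots, report only the settings whose values differ. This is not PixelConfig's implementation, which the abstract does not describe in detail. The function `diff_configs` and the setting names (`automatic_events`, `first_party_cookies`, `tracking_restrictions`) are hypothetical, chosen only to mirror the three configuration types named in the abstract.

```python
from typing import Any


def diff_configs(
    baseline: dict[str, Any], observed: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return the settings whose values differ between two snapshots.

    Each entry maps a setting name to a (baseline_value, observed_value)
    pair. A setting absent from one snapshot is reported with None on
    that side.
    """
    changed: dict[str, tuple[Any, Any]] = {}
    # Walk the union of keys so settings present in only one snapshot are seen.
    for key in sorted(baseline.keys() | observed.keys()):
        before = baseline.get(key)
        after = observed.get(key)
        if before != after:
            changed[key] = (before, after)
    return changed


# Hypothetical snapshots of one pixel deployment on two websites.
site_a = {
    "automatic_events": True,
    "first_party_cookies": True,
    "tracking_restrictions": False,
}
site_b = {
    "automatic_events": True,
    "first_party_cookies": False,
    "tracking_restrictions": True,
}

delta = diff_configs(site_a, site_b)
for setting, (before, after) in delta.items():
    print(f"{setting}: {before} -> {after}")
```

Under these assumptions, settings shared by both snapshots drop out of the result, so only the configuration differences between the two deployments remain for inspection.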