MAC randomization is a widely used technique, implemented on most modern smartphones, that protects users' privacy against tracking based on captured Probe Request frames. However, the technique has weaknesses that may still expose distinctive information, allowing the device generating the Probe Requests to be tracked. Methods that exploit these weaknesses, known as MAC de-randomization algorithms, generally use the Information Elements (IEs) contained in Probe Requests and apply clustering to group together frames belonging to the same device. While effective on heterogeneous device types, such methods cannot differentiate among devices of identical type running the same Operating System (OS). In this paper, we propose a MAC de-randomization technique that overcomes this limitation. First, we present a new dataset of Probe Requests captured from devices sharing the same characteristics. Second, we observe that the time-frequency pattern of Probe Request emission is unique to each device and can therefore be used as a discriminative feature. We embed this feature in a two-stage clustering methodology and show experimentally that it is effective compared with state-of-the-art techniques based solely on IE fingerprinting. The original dataset used in this work is made publicly available for reproducible research.
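To make the two-stage idea concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes each captured frame has been reduced to a `(timestamp, mac, ie_fingerprint)` tuple, uses the median inter-frame interval per randomized MAC as a simple stand-in for the time-frequency pattern, and merges MACs whose intervals differ by less than a hypothetical tolerance `tol`. The function name, the feature, and the tolerance are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import median


def two_stage_cluster(frames, tol=0.05):
    """Group randomized MACs that plausibly belong to one device.

    frames: iterable of (timestamp_seconds, mac, ie_fingerprint) tuples.
    tol: hypothetical tolerance (seconds) for treating two timing
         features as the same; not a value taken from the paper.
    Returns a list of sets of MAC addresses.
    """
    # Stage 1: partition MACs by their Information Element fingerprint.
    times_by_ie = defaultdict(lambda: defaultdict(list))
    for ts, mac, ie in frames:
        times_by_ie[ie][mac].append(ts)

    clusters = []
    for macs in times_by_ie.values():
        # Stage 2: describe each MAC by its median inter-frame interval
        # (an assumed stand-in for the time-frequency pattern).
        features = []
        for mac, times in macs.items():
            times.sort()
            gaps = [b - a for a, b in zip(times, times[1:])]
            if gaps:
                features.append((median(gaps), mac))
            else:
                # A single-frame MAC carries no timing information.
                clusters.append({mac})

        # 1-D clustering: sort the features and split wherever two
        # consecutive values are more than tol apart.
        features.sort()
        current = []
        for value, mac in features:
            if current and value - current[-1][0] > tol:
                clusters.append({m for _, m in current})
                current = []
            current.append((value, mac))
        if current:
            clusters.append({m for _, m in current})
    return clusters
```

With this sketch, two devices that share an IE fingerprint but emit Probe Requests at different rates end up in separate clusters, which IE-only grouping would merge. A realistic feature would likely need to be richer than a single median interval.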