Network operators and researchers frequently use Internet measurement platforms (IMPs), such as RIPE Atlas, RIPE RIS, or RouteViews, for tasks such as monitoring network performance, detecting routing events, discovering topology, or optimizing routes. To interpret measurement results correctly and avoid pitfalls or unwarranted generalizations, users must understand a platform's limitations. To this end, this paper studies an important limitation of IMPs, the \textit{bias} that arises from the non-uniform deployment of their vantage points. Specifically, we introduce a generic framework to systematically and comprehensively quantify the multi-dimensional biases of IMPs (e.g., across location, topology, and network type). Using the framework and open datasets, we perform a detailed analysis of biases in IMPs that confirms biases well known to domain experts and sheds light on lesser-known or unexplored ones. To help IMP users become aware of and explore the bias in their measurements, and to support further research and analysis (e.g., methods for mitigating bias), we publicly share our code and data and provide online tools (API, Web app, etc.) that calculate and visualize the bias in measurement setups.