Network operators and researchers frequently use Internet measurement platforms (IMPs), such as RIPE Atlas, RIPE RIS, or RouteViews, for tasks such as monitoring network performance, detecting routing events, discovering topology, or optimizing routes. To interpret measurement results correctly and avoid pitfalls or wrong generalizations, users must understand a platform's limitations. To this end, this paper studies an important limitation of IMPs, the \textit{bias} that arises from the non-uniform deployment of their vantage points. Specifically, we introduce a generic framework to systematically and comprehensively quantify the biases of IMPs along multiple dimensions (e.g., location, topology, and network type). Using the framework and open datasets, we perform a detailed analysis of biases in IMPs that confirms biases well known to domain experts and sheds light on lesser-known or unexplored ones. To help IMP users become aware of and explore the bias in their measurements, and to support further research and analysis (e.g., methods for mitigating bias), we publicly share our code and data and provide online tools (API, Web app, etc.) that calculate and visualize the bias in measurement setups.