Internet-wide scanning is commonly used to understand the topology and security of the Internet. However, IPv4 scans have been limited to a subset of services: exhaustively scanning all IPv4 services is too costly, and no existing bandwidth-saving framework is designed to scan IPv4 addresses across all ports. In this work, we introduce GPS, a system that efficiently discovers Internet services across all ports. GPS runs a predictive framework that learns from extremely small sample sizes and is highly parallelizable, allowing it to quickly find patterns between services across all 65K ports and myriad features. GPS computes service predictions in 13 minutes (four orders of magnitude faster than prior work) and finds 92.5% of services across all ports with 131x less bandwidth and 204x higher precision than exhaustive scanning. GPS is the first work to show that, given at least two responsive IP addresses on a port to train from, predicting the majority of services across all ports is possible and practical.
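To give a sense of why exhaustive all-port scanning is described as too costly, the following back-of-the-envelope sketch counts the probes needed to send one packet to every (address, port) pair. It is an illustration under stated assumptions, not part of GPS: the helper names are hypothetical, the count includes all 2^32 addresses even though real scans exclude reserved ranges, and the rate of one million probes per second is an assumed figure rather than one taken from this work.

```python
# Back-of-the-envelope cost of exhaustively probing every IPv4 address on every port.
# Assumption: one probe per (address, port) pair, no retries, no excluded ranges.

IPV4_ADDRESSES = 2 ** 32  # full IPv4 address space, reserved ranges included
PORTS = 2 ** 16           # 65,536 port numbers, the "65K ports" in the text


def exhaustive_probes(addresses: int = IPV4_ADDRESSES, ports: int = PORTS) -> int:
    """Number of probes needed to touch every (address, port) pair once."""
    return addresses * ports


def scan_days(probes: int, probes_per_second: float) -> float:
    """Days needed to send `probes` packets at a constant rate."""
    seconds = probes / probes_per_second
    return seconds / 86_400  # seconds per day


total = exhaustive_probes()
# Assumed (hypothetical) send rate of one million probes per second.
days = scan_days(total, probes_per_second=1_000_000)
print(f"{total:,} probes, roughly {days:,.0f} days at 1M probes/s")
```

Under these assumptions a single scanner would need several years for one full pass, which is the gap that a predictive approach aims to close.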