Active TLS Stack Fingerprinting: Characterizing TLS Server Deployments at Scale

Active measurements can be used to collect server characteristics on a large scale. This kind of metadata can help discover hidden relations and commonalities among server deployments, offering new possibilities to cluster and classify them. As an example, identifying previously unknown cybercriminal infrastructure can be a valuable source of cyber-threat intelligence. We propose an active measurement-based methodology for acquiring Transport Layer Security (TLS) metadata from servers and leveraging it to fingerprint them. Our fingerprints capture the characteristic behavior of the TLS stack, which is primarily determined by the implementation, configuration, and hardware support of the underlying server. Using an empirical optimization strategy that maximizes the information gain from every handshake in order to minimize measurement costs, we generated 10 general-purpose Client Hellos. These serve as scanning probes to build a large database of TLS configurations, which is then used to classify servers. We fingerprinted 28 million servers from the Alexa and Majestic toplists and two Command and Control (C2) blocklists over a period of 30 weeks, taking weekly snapshots, as the foundation for two long-term case studies: the classification of Content Delivery Network (CDN) servers and of C2 servers. The proposed methodology achieves a precision of more than 99 % and enables stable identification of new servers over time. This study describes a new opportunity for active measurements to provide valuable insights into the Internet that can be used in security-relevant use cases.
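
The abstract does not spell out the optimization strategy, so the following is only a minimal sketch of one way "maximizing information gain per handshake" could be approached: a greedy selection that repeatedly adds the candidate probe whose responses increase the Shannon entropy of the combined fingerprint the most. The function names, the probe names (`ch_a`, `ch_b`, `ch_c`), and the toy response values are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(observations):
    """Shannon entropy in bits of a list of hashable observations."""
    total = len(observations)
    counts = Counter(observations)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def greedy_probe_selection(responses, k):
    """Greedily pick up to k probes that maximize joint fingerprint entropy.

    responses: dict mapping a (hypothetical) probe name to a list of
               per-server responses, with the same server order in every list.
    Returns the chosen probe names and the per-server fingerprint tuples.
    """
    n_servers = len(next(iter(responses.values())))
    chosen = []
    fingerprints = [()] * n_servers  # one (initially empty) tuple per server
    current = 0.0
    while len(chosen) < k:
        best_name, best_fp, best_h = None, None, current
        for name, obs in responses.items():
            if name in chosen:
                continue
            # Fingerprint each server would have if this probe were added.
            candidate = [fp + (o,) for fp, o in zip(fingerprints, obs)]
            h = entropy(candidate)
            if h > best_h + 1e-12:
                best_name, best_fp, best_h = name, candidate, h
        if best_name is None:
            break  # no remaining probe adds information
        chosen.append(best_name)
        fingerprints, current = best_fp, best_h
    return chosen, fingerprints


# Toy data (assumed): four servers, three candidate Client Hellos.
toy = {
    "ch_a": ["x", "x", "y", "y"],
    "ch_b": ["1", "2", "1", "2"],
    "ch_c": ["x", "x", "y", "y"],  # redundant with ch_a
}
chosen, fingerprints = greedy_probe_selection(toy, k=3)
print(chosen)
```

In this toy run the redundant probe `ch_c` is never selected, because it does not split the servers any further once `ch_a` has been chosen. This illustrates why a small probe set can suffice when each probe is picked for the additional information it contributes.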

