Achieving consistent time across devices in distributed systems often involves exchanging timestamped messages over a network. Precise time synchronization is crucial for applications such as cellular networks, industrial automation, and transactional databases. However, delay variation in synchronization packets, often caused by congestion from competing traffic, degrades synchronization accuracy. Detecting whether a packet experienced congestion can help improve synchronization through filtering and statistical methods. We propose an in-network congestion indication and filtering mechanism for synchronization messages used in protocols such as the Network Time Protocol (NTP) and the Precision Time Protocol (PTP). Network devices mark packets that experienced queuing, allowing clocks to correct errors caused by varying delays. Our approach requires only simple changes at switches or routers and avoids deep packet inspection and protocol modifications. The method is backward compatible because it uses standard but currently unused fields in IP, PTP, or NTP headers. We implement our method on a Tofino P4 target and demonstrate an improvement of over 80% in synchronization performance over a single hop. Moreover, we show that the performance of traditional statistical filters, such as min-RTT and median-delay, improves by 90% on the one-hop hardware setup. We further demonstrate the effectiveness of the proposed method across multiple hops, both analytically and through simulation. Congestion marking reduces the root-mean-square error of the clock offset estimate by 30% to 80%, depending on network conditions and filtering techniques.
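To make the filtering idea concrete, the following is a minimal sketch of how a client clock might use congestion marks, not the implementation evaluated here. It assumes an NTP-style two-way exchange with the usual four timestamps, and a hypothetical boolean `marked` that is true when any hop reported queuing; how that mark is carried in the IP, PTP, or NTP header is not modeled. The names `Exchange` and `estimate_offset` are illustrative, as is the choice to take the median of unmarked samples and fall back to min-RTT when every sample is marked.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class Exchange:
    """One two-way timestamp exchange (NTP-style), times in seconds."""
    t1: float     # request sent, client clock
    t2: float     # request received, server clock
    t3: float     # reply sent, server clock
    t4: float     # reply received, client clock
    marked: bool  # hypothetical: True if any hop marked the packet as queued

    @property
    def offset(self) -> float:
        # Standard two-way offset estimate; exact when path delays are symmetric.
        return ((self.t2 - self.t1) + (self.t3 - self.t4)) / 2.0

    @property
    def rtt(self) -> float:
        # Round-trip network delay, excluding server processing time.
        return (self.t4 - self.t1) - (self.t3 - self.t2)


def estimate_offset(exchanges: List[Exchange]) -> float:
    """Estimate the clock offset, preferring exchanges without congestion marks."""
    if not exchanges:
        raise ValueError("at least one exchange is required")
    clean = [e for e in exchanges if not e.marked]
    if clean:
        # Unmarked samples saw no reported queuing, so combine them directly.
        return median(e.offset for e in clean)
    # Assumed fallback: every sample was marked, so use the min-RTT sample.
    return min(exchanges, key=lambda e: e.rtt).offset


# Example: true offset 5.0 s, base one-way delay 1.0 s in each direction.
unmarked = Exchange(t1=0.0, t2=6.0, t3=6.5, t4=2.5, marked=False)
queued = Exchange(t1=10.0, t2=18.0, t3=18.5, t4=14.5, marked=True)  # +2.0 s forward queuing
print(estimate_offset([unmarked, queued]))
```

In this example the queued exchange alone would yield an offset of 6.0 s, an error of half the extra forward delay, while discarding it recovers the true 5.0 s offset.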