Latency is a major concern for web rendering engines like those in Chrome, Safari, and Firefox. These engines reduce latency by using an incremental layout algorithm to redraw the page when the user interacts with it. In such an algorithm, elements that change from frame to frame are marked dirty; only the dirty elements need to be processed to draw the next frame, dramatically reducing latency. However, the standard incremental layout algorithm must search the page for dirty elements, accessing a number of auxiliary elements in the process. These auxiliary elements add cache misses and stalled cycles, and are responsible for a sizable fraction of all layout latency. We introduce a new, faster incremental layout algorithm called Spineless Traversal. Spineless Traversal uses a more computationally demanding priority queue algorithm to avoid the need to access auxiliary nodes, and thus reduces cache traffic and stalls. This leads to dramatic speedups on the most latency-critical interactions, such as hovering, typing, and animation. Moreover, thanks to numerous low-level optimizations, we are able to make Spineless Traversal competitive across the whole spectrum of incremental layout workloads. As a result, across 2216 benchmarks, Spineless Traversal is faster on 78.2% of the benchmarks, with a mean speedup of 3.23x that is concentrated in those latency-critical interactions.
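To make the contrast concrete, here is a minimal Python sketch of the two strategies the abstract describes. It is not the authors' implementation: the `Node` class, the `subtree_dirty` bit, and the use of a static pre-order index as the priority-queue key are all illustrative assumptions, and the tree is assumed not to change shape between frames. The sketch only counts how many nodes each strategy touches to reach the dirty ones.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None
    dirty: bool = False          # this node itself needs re-layout
    subtree_dirty: bool = False  # this node or some descendant is dirty
    order: int = 0               # pre-order index (assumed static here)


def build(name, *children):
    """Create a node and wire up parent pointers (hypothetical helper)."""
    node = Node(name, list(children))
    for child in children:
        child.parent = node
    return node


def number(root):
    """Assign pre-order indices, used as queue keys in this sketch."""
    counter, stack = 0, [root]
    while stack:
        node = stack.pop()
        node.order = counter
        counter += 1
        stack.extend(reversed(node.children))


def mark_dirty_bits(node):
    """Dirty-bit scheme: flag the node and every ancestor up to the root."""
    node.dirty = True
    while node is not None and not node.subtree_dirty:
        node.subtree_dirty = True
        node = node.parent


def traverse_dirty_bits(root):
    """Search from the root; returns (processed names, nodes accessed)."""
    processed, accessed, stack = [], 0, [root]
    while stack:
        node = stack.pop()
        accessed += 1                 # reading the bit touches the node
        if not node.subtree_dirty:
            continue                  # clean subtree: skip it
        node.subtree_dirty = False
        if node.dirty:
            node.dirty = False
            processed.append(node.name)
        stack.extend(reversed(node.children))
    return processed, accessed


def mark_queue(queue, node):
    """Queue scheme: record the dirty node itself, keyed by its order."""
    if not node.dirty:
        node.dirty = True
        heapq.heappush(queue, (node.order, id(node), node))


def traverse_queue(queue):
    """Pop dirty nodes in order; returns (processed names, nodes accessed)."""
    processed, accessed = [], 0
    while queue:
        _, _, node = heapq.heappop(queue)
        accessed += 1                 # only dirty nodes are ever touched
        node.dirty = False
        processed.append(node.name)
    return processed, accessed
```

In this toy model, the dirty-bit traversal touches every node on the path from the root to each dirty node, plus their siblings, while the queue-based traversal touches only the dirty nodes and pays for heap operations instead. That mirrors the trade-off stated in the abstract: more computation in exchange for fewer accesses to auxiliary nodes. A real engine would also have to keep the ordering valid as nodes are inserted and removed, which this sketch does not attempt.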