Network function (NF) offloading on SmartNICs is widely used in modern data centers, where it saves host resources and offers programmability. Co-running NFs on the same SmartNIC can cause performance interference due to contention for onboard resources. To meet performance SLAs while managing resources efficiently, operators need mechanisms to predict NF performance under such contention. However, existing solutions lack SmartNIC-specific knowledge and exhibit limited traffic awareness, leading to poor accuracy for on-NIC NFs. This paper proposes Yala, a novel performance prediction system for on-NIC NFs. Yala builds upon the key observation that co-located NFs contend for multiple resources, including onboard accelerators and the memory subsystem. It also incorporates traffic awareness based on the behavior of each individual resource, so that accuracy is maintained as external traffic attributes vary. Evaluation using BlueField-2 SmartNICs shows that Yala improves prediction accuracy by 78.8% and reduces SLA violations by 92.2% compared to state-of-the-art approaches, and enables new practical use cases.