Network-on-Chip (NoC) congestion builds up under heavy traffic load and cripples system performance by stalling the cores. Moreover, congestion wastes link bandwidth through blocked buffers and bouncing packets. Existing approaches throttle the cores only after congestion is detected, which reduces efficiency and wastes link bandwidth unnecessarily. In contrast, we propose a lightweight machine learning-based technique that helps predict congestion in the network. Specifically, the proposed technique collects traffic-related features at each destination and then labels them using a novel time reversal approach. The labeled data is used to design a low-overhead, explainable decision tree model for runtime congestion control. Experimental evaluations with synthetic and real traffic on an industrial 6$\times$6 NoC show that the proposed approach increases fairness and memory read bandwidth by up to 114\% with respect to an existing congestion control technique while incurring less than 0.01\% overhead.
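The pipeline summarized above (collect per-destination features, label them backwards in time, fit a small tree) can be illustrated with a minimal sketch. The abstract does not specify the labeling rule or the feature set, so everything here is an assumption made for illustration: the `horizon` parameter, the use of buffer occupancy as the single feature, and the function names `label_by_time_reversal` and `fit_stump` are all hypothetical. The tree is reduced to depth one (a single threshold) to keep the example short; it is not the model evaluated in the paper.

```python
from typing import List


def label_by_time_reversal(congested: List[bool], horizon: int) -> List[int]:
    """Walk a trace backwards and mark samples that precede congestion.

    Assumed rule (not taken from the paper): a sample gets label 1 if
    congestion is observed at that step or within the next `horizon` steps.
    """
    labels = [0] * len(congested)
    steps_until_congestion = None  # distance to the nearest later congestion
    for t in range(len(congested) - 1, -1, -1):
        if congested[t]:
            steps_until_congestion = 0
        elif steps_until_congestion is not None:
            steps_until_congestion += 1
        if steps_until_congestion is not None and steps_until_congestion <= horizon:
            labels[t] = 1
    return labels


def fit_stump(values: List[float], labels: List[int]) -> float:
    """Fit a depth-one decision tree on one feature.

    Returns the threshold `thr` for which the rule "predict 1 when
    value > thr" misclassifies the fewest training samples.
    """
    best_thr = float("inf")   # baseline: never predict congestion
    best_err = sum(labels)    # errors made by that baseline
    for thr in sorted(set(values)):
        err = sum((v > thr) != bool(y) for v, y in zip(values, labels))
        if err < best_err:
            best_thr, best_err = thr, err
    return best_thr


# Hypothetical trace at one destination: congestion is detected at step 4.
congested = [False, False, False, False, True, False, False]
occupancy = [0.1, 0.2, 0.6, 0.8, 0.9, 0.3, 0.2]  # assumed buffer-occupancy feature

labels = label_by_time_reversal(congested, horizon=2)
threshold = fit_stump(occupancy, labels)

# At runtime, a source would be throttled when the rule fires.
should_throttle = [v > threshold for v in occupancy]
```

In this sketch the samples at steps 2 and 3 are labeled positive even though congestion is only detected at step 4, which is what lets the fitted rule fire before congestion is observed. A single comparison per decision is also consistent in spirit with the low-overhead, explainable model the abstract describes, although the actual tree depth and features are not stated there.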