Fast and accurate performance analysis techniques are essential in early design space exploration and pre-silicon evaluations, including software ecosystem development. In particular, on-chip communication plays an increasingly important role as many-core processors scale up. This paper presents the first performance analysis technique for networks-on-chip (NoCs) that employ weighted round-robin (WRR) arbitration. Besides fairness, WRR arbitration provides flexibility in allocating bandwidth in proportion to the importance of the traffic classes, unlike basic round-robin and priority-based arbitration. The proposed approach first estimates the effective service time of the packets in the queue due to WRR arbitration. Then, it uses the effective service time to compute the average waiting time of the packets. Finally, it incorporates a decomposition technique to extend the analytical model to NoCs of any size. The proposed approach achieves less than 5% error while executing real applications and 10% error under challenging synthetic traffic with different burstiness levels.
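To make the arbitration policy concrete, the following is a minimal sketch of a WRR arbiter at a single output port, assuming one packet is served per cycle and that an empty queue forfeits its turn (work-conserving behaviour). The function name `wrr_schedule`, its parameters, and the credit-based bookkeeping are illustrative assumptions, not the paper's model; the sketch only shows how weights translate into bandwidth shares and does not implement the analytical service-time or waiting-time estimation.

```python
from collections import deque


def wrr_schedule(queues, weights, cycles):
    """Simulate a weighted round-robin arbiter (illustrative sketch).

    queues  -- list of deques, one per traffic class (hypothetical layout)
    weights -- positive integers; class i may send up to weights[i]
               packets in a row before the arbiter moves on
    cycles  -- number of cycles to simulate, one grant per cycle at most

    Returns a list with the index of the class served in each cycle,
    or None for a cycle in which every queue was empty.
    """
    n = len(queues)
    current = 0            # class currently holding the grant
    credit = weights[0]    # packets the current class may still send
    grants = []
    for _ in range(cycles):
        served = None
        # Visit each class at most once more than n times so that the
        # starting class is re-checked with a refreshed credit.
        for _ in range(n + 1):
            if queues[current] and credit > 0:
                queues[current].popleft()
                credit -= 1
                served = current
                break
            # Empty queue: pass the grant to the next class.
            current = (current + 1) % n
            credit = weights[current]
        grants.append(served)
        # Credit exhausted: rotate to the next class for the next cycle.
        if served is not None and credit == 0:
            current = (current + 1) % n
            credit = weights[current]
    return grants


if __name__ == "__main__":
    # Two saturated traffic classes with weights 3 and 1.
    qs = [deque(range(60)), deque(range(60))]
    g = wrr_schedule(qs, weights=[3, 1], cycles=40)
    share = [g.count(i) / len(g) for i in range(2)]
    print(share)
```

Under saturation, each class in this sketch receives a share of grants equal to its weight divided by the sum of the weights, which is the proportional-bandwidth property the abstract attributes to WRR arbitration. When a class is idle, its slots go to the other classes, so the time a packet effectively waits for service depends on the traffic of the competing classes as well as on the weights.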