2.5D integration technology is gaining traction as it copes with the exponentially growing design cost of modern integrated circuits. A crucial part of a 2.5D stacked chip is a low-latency and high-throughput inter-chiplet interconnect (ICI). Two major factors affecting the latency and throughput are the topology of links between chiplets and the chiplet placement. In this work, we present PlaceIT, a novel methodology to jointly optimize the ICI topology and the chiplet placement. While state-of-the-art methods either optimize the chiplet placement for a predetermined ICI topology or select one topology out of a set of candidates, we generate a completely new topology for each placement. Our process of inferring placement-based ICI topologies connects chiplets that are in close proximity to each other, making it particularly attractive for chips with silicon bridges or passive silicon interposers, where link lengths are severely limited. We provide an open-source implementation of our method that optimizes the placement of homogeneously or heterogeneously shaped chiplets and the ICI topology connecting them for a user-defined mix of four different traffic types. We evaluate our methodology using synthetic traffic and traces, and we compare our results to a 2D mesh baseline. PlaceIT reduces the latency of synthetic L1-to-L2 and L2-to-memory traffic, the two most important types for cache coherency traffic, by up to 28% and 62%, respectively. It also achieves an average packet latency reduction of up to 18% on traffic traces. PlaceIT enables the construction of 2.5D stacked chips with low-latency ICIs.
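To illustrate the idea of inferring an ICI topology from a placement, the following minimal sketch connects every pair of chiplets whose distance does not exceed a maximum link length. It is not PlaceIT's actual algorithm: the function name `infer_topology`, the use of chiplet center coordinates, and the Euclidean distance threshold `max_link_length` are all assumptions made for illustration only.

```python
from itertools import combinations
from math import hypot


def infer_topology(centers, max_link_length):
    """Infer links from a placement (hypothetical helper, not PlaceIT's API).

    centers: dict mapping a chiplet name to its (x, y) center coordinate.
    max_link_length: longest link allowed, in the same unit as the coordinates.
    Returns a sorted list of undirected links as (name_a, name_b) tuples.
    """
    links = []
    # Visit every unordered pair of chiplets exactly once, in name order.
    for (a, (ax, ay)), (b, (bx, by)) in combinations(sorted(centers.items()), 2):
        # Assumption: link length is the Euclidean center-to-center distance.
        if hypot(ax - bx, ay - by) <= max_link_length:
            links.append((a, b))
    return links


# Example placement: three chiplets close together and one far away.
placement = {"c0": (0.0, 0.0), "c1": (1.0, 0.0), "c2": (0.0, 1.0), "c3": (3.0, 3.0)}
short_links = infer_topology(placement, max_link_length=1.0)
long_links = infer_topology(placement, max_link_length=1.5)
```

Under these assumptions, a different placement yields a different set of links, which is why placement and topology have to be evaluated together. A chiplet with no neighbor within range (here `c3`) stays unconnected, so a real optimizer would have to reject or penalize such placements.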