As program workloads (e.g., AI) increase in size and algorithmic complexity, the primary challenge lies in their high dimensionality, which spans computing cores, array sizes, and memory hierarchies. Overcoming these obstacles requires innovative approaches. Agile chip design has already benefited from machine learning integration at various stages, including logic synthesis, placement, and routing. With Large Language Models (LLMs) recently demonstrating impressive proficiency in Hardware Description Language (HDL) generation, it is promising to extend their abilities to 2.5D integration, an advanced technique that saves area overhead and development costs. However, LLM-driven chiplet design faces challenges such as flattened design, high validation cost, and imprecise parameter optimization, all of which limit its chiplet design capability. To address this, we propose MAHL, a hierarchical LLM-based chiplet design generation framework featuring six agents that collaboratively enable AI algorithm-hardware mapping through hierarchical description generation, retrieval-augmented code generation, diverse-flow-based validation, and multi-granularity design space exploration. Together, these components enable efficient generation of chiplet designs with optimized Power, Performance, and Area (PPA). Experiments show that MAHL not only significantly improves the generation accuracy of simple RTL designs, but also increases the generation accuracy of real-world chiplet designs, evaluated by Pass@5, from 0 to 0.72 compared to conventional LLMs under the best-case scenario. Compared to the state-of-the-art expert-based CLARIE, MAHL achieves comparable or even superior PPA results under certain optimization objectives.
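The abstract reports generation accuracy as Pass@5 but does not state how the metric is computed. The sketch below assumes the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of generated samples per task and c is the number that pass validation. The function names `pass_at_k` and `mean_pass_at_k` are hypothetical and are not taken from MAHL; the evaluation in the paper may differ.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task (assumed estimator, not confirmed by the source).

    n: total samples generated for the task
    c: samples that passed validation
    k: budget of samples considered (k = 5 for Pass@5)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks, each given as an (n, c) pair."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

Under this assumption, a Pass@5 of 0 means that no size-5 subset of samples for any benchmark design contained a passing result, while 0.72 is the average over designs of the probability that at least one of five samples passes.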