LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation

Directed acyclic graphs (DAGs) serve as crucial data representations in domains such as hardware synthesis and compiler/program optimization for computing systems. DAG generative models facilitate the creation of synthetic DAGs, which can be used for benchmarking computing systems while preserving intellectual property. However, generating realistic DAGs is challenging due to their inherent directional and logical dependencies. This paper introduces LayerDAG, an autoregressive diffusion model, to address these challenges. LayerDAG decouples the strong node dependencies into manageable units that can be processed sequentially. By interpreting the partial order of nodes as a sequence of bipartite graphs, LayerDAG leverages autoregressive generation to model directional dependencies and employs diffusion models to capture logical dependencies within each bipartite graph. Comparative analyses demonstrate that LayerDAG outperforms existing DAG generative models in both expressiveness and generalization, particularly for generating large-scale DAGs with up to 400 nodes, a critical scenario for system benchmarking. Extensive experiments on both synthetic and real-world flow graphs from various computing platforms show that LayerDAG generates valid DAGs with superior statistical properties and benchmarking performance. The synthetic DAGs generated by LayerDAG enhance the training of ML-based surrogate models, resulting in improved accuracy in predicting performance metrics of real-world DAGs across diverse computing platforms.
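The idea of reading a DAG's partial order as a sequence of bipartite graphs can be illustrated with a small sketch. The abstract does not spell out how layers are defined, so the code below assumes one plausible choice: a node's layer is the length of the longest path reaching it from any source node. The function names `layerize` and `bipartite_sequence` are hypothetical and are not taken from the paper. Each step of the resulting sequence holds the new layer's nodes plus the edges entering them from earlier layers, which is the unit an autoregressive model would generate one at a time.

```python
from collections import defaultdict


def layerize(num_nodes, edges):
    """Return a layer index per node (assumed: longest path from a source)."""
    succs = defaultdict(list)
    indeg = [0] * num_nodes
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1

    layer = [0] * num_nodes
    frontier = [n for n in range(num_nodes) if indeg[n] == 0]
    visited = 0
    # Kahn-style traversal: a node is processed only after all its predecessors.
    while frontier:
        u = frontier.pop()
        visited += 1
        for v in succs[u]:
            layer[v] = max(layer[v], layer[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                frontier.append(v)
    if visited != num_nodes:
        raise ValueError("input graph contains a cycle")
    return layer


def bipartite_sequence(num_nodes, edges):
    """Split a DAG into per-layer steps: (new nodes, edges entering them)."""
    layer = layerize(num_nodes, edges)
    depth = max(layer, default=-1) + 1
    nodes_per_layer = [[] for _ in range(depth)]
    for n in range(num_nodes):
        nodes_per_layer[layer[n]].append(n)
    incoming = [[] for _ in range(depth)]
    for u, v in edges:
        # Every edge points from a strictly earlier layer into layer[v].
        incoming[layer[v]].append((u, v))
    return [(nodes_per_layer[k], sorted(incoming[k])) for k in range(depth)]


if __name__ == "__main__":
    toy_edges = [(0, 2), (1, 2), (0, 3), (2, 3)]
    for step, (new_nodes, new_edges) in enumerate(bipartite_sequence(4, toy_edges)):
        print(step, new_nodes, new_edges)
```

Under this assumed definition, every edge runs from an earlier layer to a later one, so appending layers in order can never create a cycle; that is one way to see why a layerwise scheme yields valid DAGs by construction.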

