FloorSet - a VLSI Floorplanning Dataset with Design Constraints of Real-World SoCs

Floorplanning for systems-on-a-chip (SoCs) and their subsystems is a crucial and non-trivial step of the physical design flow, and it poses a difficult combinatorial optimization problem. A typical large-scale SoC with 120 partitions generates a search space of nearly 10E250. As novel machine learning (ML) approaches emerge to tackle such problems, there is a growing need for a modern benchmark that comprises a large training dataset and performance metrics that reflect real-world constraints and objectives better than existing benchmarks do. To address this need, we present FloorSet: two comprehensive datasets of synthetic fixed-outline floorplan layouts that reflect the distribution of real SoCs. Each dataset has 1M training samples and 100 test samples, where each sample is a synthetic floorplan. FloorSet-Prime comprises fully-abutted rectilinear partitions and near-optimal wire-length. FloorSet-Lite, a simplified dataset that reflects early design phases, comprises rectangular partitions with under 5 percent white-space and near-optimal wire-length. Both datasets define hard constraints seen in modern design flows, such as shape constraints, edge-affinity, grouping constraints, and pre-placement constraints. FloorSet is intended to spur fundamental research on large-scale constrained optimization problems. Crucially, FloorSet alleviates the core issue of reproducibility in modern ML-driven solutions to such problems. FloorSet is available as an open-source repository for the research community.
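To make the "under 5 percent white-space" property of a fixed-outline floorplan with rectangular partitions concrete, the following minimal sketch computes the white-space fraction of a toy layout. The `(x, y, width, height)` tuple layout and the names `Rect`, `overlaps`, and `whitespace_fraction` are assumptions made for illustration only; they are not FloorSet's data format or API.

```python
from typing import List, Tuple

# Hypothetical block representation (not FloorSet's format):
# lower-left corner (x, y) plus width and height.
Rect = Tuple[float, float, float, float]


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if two rectangles share interior area (touching edges is fine)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def whitespace_fraction(outline_w: float, outline_h: float, blocks: List[Rect]) -> float:
    """Fraction of the fixed outline not covered by any block.

    Raises ValueError if a block leaves the outline or two blocks overlap,
    since the area sum would then no longer measure coverage.
    """
    for i, a in enumerate(blocks):
        x, y, w, h = a
        if x < 0 or y < 0 or x + w > outline_w or y + h > outline_h:
            raise ValueError(f"block {i} lies outside the outline")
        for b in blocks[i + 1:]:
            if overlaps(a, b):
                raise ValueError(f"block {i} overlaps another block")
    used = sum(w * h for _, _, w, h in blocks)
    return 1.0 - used / (outline_w * outline_h)


# Toy 10 x 10 outline with three abutting rectangular partitions.
toy_blocks = [(0, 0, 10, 5), (0, 5, 5, 5), (5, 5, 5, 4.5)]
ws = whitespace_fraction(10, 10, toy_blocks)
print(f"white-space: {ws:.1%}")
```

In this toy layout the blocks cover 97.5 of 100 area units, so the layout falls under the 5 percent threshold. A fully-abutted layout in the sense described for FloorSet-Prime would correspond to a white-space fraction of zero, although rectilinear partitions would need a richer shape representation than the rectangles assumed here.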

