Machine learning (ML) techniques have been applied to high-level synthesis (HLS) flows for quality-of-result (QoR) prediction and design space exploration (DSE). Nevertheless, accessible high-quality HLS datasets are scarce, and building them is complex. Existing datasets are limited in benchmark coverage, design space enumeration, or vendor extensibility, or lack reproducible and extensible software for dataset construction. Many works also lack user-friendly ways to add more designs, which limits wider adoption of such datasets. In response to these challenges, we introduce HLSFactory, a comprehensive framework designed to facilitate the curation and generation of high-quality HLS design datasets. HLSFactory has three main stages: 1) a design space expansion stage that elaborates single HLS designs into large design spaces using various optimization directives across multiple vendor tools, 2) a design synthesis stage that executes HLS and FPGA tool flows concurrently across designs, and 3) a data aggregation stage that extracts standardized data into packaged datasets for ML usage. This three-stage architecture ensures broad design space coverage via design space expansion and supports multiple vendor tools. Users can contribute to each stage with their own HLS designs and synthesis results, and can extend the framework itself with custom frontends and tool flows. We also include an initial set of built-in designs drawn from common HLS benchmarks and curated open-source HLS designs. We showcase the versatility of our framework through seven case studies: I) ML model for QoR prediction; II) Design space sampling; III) Fine-grained parallelism backend speedup; IV) Targeting Intel's HLS flow; V) Adding new auxiliary designs; VI) Integrating published HLS data; VII) HLS tool version regression benchmarking.
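The three stages described in the abstract (design space expansion, concurrent synthesis, data aggregation) can be illustrated with a minimal Python sketch. This is not HLSFactory's API: every name here (`DIRECTIVE_SPACE`, `expand_design_space`, `mock_synthesize`, `aggregate`, `build_dataset`) is hypothetical, the directive options are invented for illustration, and the synthesis step is a toy cost model standing in for a real vendor tool run.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical directive options for a single design (illustrative only).
DIRECTIVE_SPACE = {
    "unroll_factor": [1, 2, 4],
    "pipeline": [False, True],
    "array_partition": [1, 2],
}


def expand_design_space(design_name, space):
    """Stage 1: enumerate every combination of directive values."""
    keys = sorted(space)
    return [
        {"design": design_name, **dict(zip(keys, values))}
        for values in product(*(space[k] for k in keys))
    ]


def mock_synthesize(point):
    """Stage 2 stand-in: a real flow would invoke a vendor HLS tool here.

    The latency and LUT numbers come from a toy model, not from any tool.
    """
    parallelism = point["unroll_factor"] * point["array_partition"]
    latency = 1024 // parallelism
    if point["pipeline"]:
        latency //= 2
    luts = 100 * parallelism + (50 if point["pipeline"] else 0)
    return {**point, "latency_cycles": latency, "luts": luts}


def aggregate(results):
    """Stage 3: collect per-design records into one sorted flat table."""
    return sorted(results, key=lambda r: (r["latency_cycles"], r["luts"]))


def build_dataset(design_name, space, workers=4):
    """Run the three stages end to end for one design."""
    points = expand_design_space(design_name, space)
    # Design points are independent, so they can be evaluated concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(mock_synthesize, points))
    return aggregate(results)


dataset = build_dataset("gemm", DIRECTIVE_SPACE)
```

Each record in `dataset` pairs one directive configuration with its (mock) QoR metrics, which is the shape of data a QoR-prediction model would train on. In a real setup the thread pool would more likely launch tool subprocesses, since each design point is an independent synthesis job.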