Many modern embedded systems have end-to-end (EtoE) latency constraints that require precise timing to ensure high reliability and functional correctness. Combining High-Level Synthesis (HLS) with Design Space Exploration (DSE) enables the rapid generation of embedded systems, using various constraints and directives to find Pareto-optimal configurations. Current HLS DSE approaches often address latency by focusing on individual components, without considering the EtoE latency during system-level optimization. However, to truly optimize a system under an EtoE latency constraint, we need a holistic approach that analyzes each component's timing constraints in the context of how the components interact and affect the overall design. This paper presents a novel system-level HLS DSE approach, called EtoE-DSE, that accommodates EtoE latency and variable timing constraints for complex multi-component application-specific embedded systems. EtoE-DSE employs a latency estimation model and a pathfinding algorithm to identify the paths between any pair of endpoints and estimate their EtoE latency. It also uses a frequency-based segmentation process to segment and prune the design space, alongside a latency-constrained optimization algorithm that explores the system-level design space efficiently and accurately. We evaluate our approach on a real-world use case, an autonomous driving subsystem, and compare it against state-of-the-art HLS DSE approaches. We show that our approach yields substantially better optimization results than prior DSE approaches, improving the quality of results by up to 89.26%, while efficiently identifying Pareto-optimal configurations in terms of energy and area.
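To make the idea of path-based EtoE latency estimation concrete, the following is a minimal sketch, not the EtoE-DSE implementation. It assumes the system can be modelled as a directed acyclic graph of components, that each component's latency is its cycle count divided by its clock frequency, and that the EtoE latency between two endpoints is the worst case over all paths connecting them. The graph, the component names, the timing figures, and every function name here are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical component graph: edges point from producer to consumer.
GRAPH: Dict[str, List[str]] = {
    "sensor": ["filter", "detect"],
    "filter": ["fuse"],
    "detect": ["fuse"],
    "fuse": ["actuate"],
    "actuate": [],
}

# Hypothetical per-component timing: (latency in clock cycles, clock in MHz).
TIMING: Dict[str, Tuple[int, float]] = {
    "sensor": (100, 100.0),
    "filter": (400, 200.0),
    "detect": (900, 100.0),
    "fuse": (200, 200.0),
    "actuate": (50, 100.0),
}


def component_latency_us(name: str) -> float:
    """Latency of one component in microseconds (cycles / MHz)."""
    cycles, mhz = TIMING[name]
    return cycles / mhz


def all_paths(graph: Dict[str, List[str]], src: str, dst: str,
              prefix: Optional[List[str]] = None) -> List[List[str]]:
    """Enumerate every simple path from src to dst by depth-first search."""
    prefix = (prefix or []) + [src]
    if src == dst:
        return [prefix]
    paths: List[List[str]] = []
    for nxt in graph[src]:
        if nxt not in prefix:  # guard against revisiting a component
            paths.extend(all_paths(graph, nxt, dst, prefix))
    return paths


def path_latency_us(path: List[str]) -> float:
    """Assumed model: a path's latency is the sum of its components' latencies."""
    return sum(component_latency_us(c) for c in path)


def worst_case_latency_us(graph: Dict[str, List[str]], src: str, dst: str) -> float:
    """Assumed EtoE latency: the slowest path between the two endpoints."""
    return max(path_latency_us(p) for p in all_paths(graph, src, dst))


if __name__ == "__main__":
    for p in all_paths(GRAPH, "sensor", "actuate"):
        print(" -> ".join(p), path_latency_us(p), "us")
    print("worst case:", worst_case_latency_us(GRAPH, "sensor", "actuate"), "us")
```

In this toy model, raising the clock of `detect` shortens only the path that passes through it, which illustrates why a per-component view of latency can miss the system-level picture. Exhaustive path enumeration is used for clarity; it scales poorly on large graphs, and `worst_case_latency_us` raises `ValueError` when no path connects the endpoints.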