Efficient Orchestrated AI Workflows Execution on Scale-out Spatial Architecture

As AI applications grow more complex, traditional spatial architectures frequently fall short. Our analysis identifies a pattern of interconnected, multi-faceted tasks that span both AI and general-purpose computation. In response, we propose "Orchestrated AI Workflows," an approach that integrates diverse tasks with logic-driven decisions into dynamic, sophisticated workflows. Specifically, we find that the intrinsic Dual Dynamicity of Orchestrated AI Workflows, namely the dynamic execution times and execution frequencies of Task Blocks, can be effectively represented by the Orchestrated Workflow Graph. This Dual Dynamicity poses three challenges to existing spatial architectures: Indiscriminate Resource Allocation, Reactive Load Rebalancing, and Contagious PEA Idleness. To overcome these challenges, we present Octopus, a scale-out spatial architecture paired with a suite of scheduling strategies optimized for executing Orchestrated AI Workflows, including the Discriminate Dual-Scheduling Mechanism, the Adaptive TBU Scheduling Strategy, and the Proactive Cluster Scheduling Strategy. Our evaluations demonstrate that Octopus significantly outperforms traditional architectures in handling the dynamic demands of Orchestrated AI Workflows and scales robustly on large-scale hardware such as wafer-scale chips.
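To make the notion of Dual Dynamicity concrete, the following is a minimal Python sketch of a workflow whose Task Blocks have input-dependent execution times and input-dependent execution frequencies. It is an illustration only, not the paper's Orchestrated Workflow Graph format or the Octopus scheduler: the `TaskBlock` class, the `detect`/`classify`/`report` blocks, their cost numbers, and `run_workflow` are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, int]
# A block body returns (execution time, name of the next block or None to stop).
Body = Callable[[State], Tuple[float, Optional[str]]]


@dataclass
class TaskBlock:
    """Hypothetical node of a workflow graph: a name plus a body."""
    name: str
    body: Body


def detect(state: State) -> Tuple[float, Optional[str]]:
    # Dynamic execution time: the (made-up) cost grows with the input size.
    n = state["objects"]
    state["remaining"] = n
    # Logic-driven decision: skip classification when nothing was found.
    return 2.0 + 0.5 * n, ("classify" if n > 0 else "report")


def classify(state: State) -> Tuple[float, Optional[str]]:
    # Dynamic frequency: this block loops once per detected object.
    state["remaining"] -= 1
    return 1.0, ("classify" if state["remaining"] > 0 else "report")


def report(state: State) -> Tuple[float, Optional[str]]:
    return 0.5, None


GRAPH: Dict[str, TaskBlock] = {
    b.name: b
    for b in (
        TaskBlock("detect", detect),
        TaskBlock("classify", classify),
        TaskBlock("report", report),
    )
}


def run_workflow(graph: Dict[str, TaskBlock], entry: str, state: State,
                 max_steps: int = 1000):
    """Walk the graph for one input; return per-block run counts and busy time."""
    counts = {name: 0 for name in graph}
    busy = {name: 0.0 for name in graph}
    current: Optional[str] = entry
    for _ in range(max_steps):
        if current is None:
            break
        elapsed, nxt = graph[current].body(state)
        counts[current] += 1
        busy[current] += elapsed
        current = nxt
    return counts, busy


if __name__ == "__main__":
    for objects in (0, 3):
        print(objects, run_workflow(GRAPH, "detect", {"objects": objects}))
```

Running the two inputs side by side shows both kinds of variation: the time spent in `detect` changes with the input, and the number of times `classify` runs changes with it as well. A static per-block resource assignment cannot anticipate either, which is the kind of mismatch the abstract attributes to existing spatial architectures.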
