Level-4+ autonomous driving systems (ADS) must run dozens of heterogeneous deep neural networks (DNNs) as end-to-end (E2E) pipelines under a strict latency constraint (<=100 ms), even as execution time varies by up to 3.3x. Cost rules out dedicating isolated hardware to each function in mass-produced ADS, so these DNNs must be densely colocated on a single chip, which introduces shared-resource contention. Tile-based accelerators expose two scheduling opportunities that conventional ADS schedulers do not exploit. First, they provide a tunable degree of parallelism (DoP): assigning more tiles to a DNN raises its DoP and can shorten its execution time. Second, they provide hardware-native isolation: tiles can be physically partitioned among colocated DNNs. But using this flexibility is expensive: changing a task's DoP triggers a stop-migrate-restart reallocation of its weights and intermediate features. At ADS task rates of 10-240 Hz, these stalls accumulate along E2E chains and threaten deadlines. Reservation-based schedulers fix DoP and leave this flexibility unused; work-conserving schedulers exploit it but assume reallocation is cheap and treat deadlines as independent. We present ADS-Tile, which combines configurable isolation and elastic reservation into a spatio-temporal isolation-sharing space that bounds where and when reallocation occurs; a probabilistic latency model and a DAG-aware runtime scheduler then use this space to decide task colocation and DoP under shared E2E deadlines. On an industry- and academia-derived ADS benchmark, ADS-Tile uses up to 32% fewer tiles than the work-conserving baseline in deadline-critical settings and cuts reallocation-induced wasted processing capacity from 17%-44% to below 1.2%. These results indicate that controlled spatio-temporal sharing improves resource efficiency and latency predictability for tile-based ADS.