Real-time perception requires planned resource utilization. Computational planning in real-time perception is governed by two considerations: accuracy and latency. Run-time decisions (e.g., choice of input resolution) induce tradeoffs that affect performance on given hardware; these tradeoffs arise from intrinsic (content, e.g., scene clutter) and extrinsic (system, e.g., resource contention) characteristics. Earlier runtime execution frameworks employed rule-based decision algorithms and operated with a fixed algorithm latency budget to balance these concerns, which is sub-optimal and inflexible. We propose Chanakya, a learned approximate execution framework that naturally derives from the streaming perception paradigm, to automatically learn the decisions induced by these tradeoffs instead. Chanakya is trained via novel rewards that balance accuracy and latency implicitly, without approximating either objective. Chanakya simultaneously considers intrinsic and extrinsic context and predicts decisions in a flexible manner. Designed with low overhead in mind, Chanakya outperforms state-of-the-art static and dynamic execution policies on public datasets on both server GPUs and edge devices.
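To make the contrast concrete, the following is a minimal sketch of the kind of rule-based, fixed-budget policy that the abstract describes as the earlier approach. It is not Chanakya's method. The `Config` type, the `pick_config` function, the `contention` multiplier, and every number in the profile table are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    scale: int             # input resolution (shorter side, pixels); hypothetical
    est_latency_ms: float  # assumed profiled latency; illustrative, not measured
    est_accuracy: float    # assumed offline accuracy proxy; illustrative, not measured


# Hypothetical offline profile of one detector at three input resolutions.
CONFIGS = [
    Config(scale=360, est_latency_ms=20.0, est_accuracy=0.30),
    Config(scale=480, est_latency_ms=30.0, est_accuracy=0.35),
    Config(scale=640, est_latency_ms=45.0, est_accuracy=0.40),
]


def pick_config(budget_ms: float, contention: float = 1.0) -> Config:
    """Rule-based choice under a fixed latency budget.

    Scales each profiled latency by an assumed contention multiplier
    (extrinsic context), keeps the configurations that fit the budget,
    and returns the one with the highest accuracy proxy. If none fit,
    it falls back to the fastest configuration.
    """
    feasible = [c for c in CONFIGS if c.est_latency_ms * contention <= budget_ms]
    if not feasible:
        return min(CONFIGS, key=lambda c: c.est_latency_ms)
    return max(feasible, key=lambda c: c.est_accuracy)


if __name__ == "__main__":
    print(pick_config(budget_ms=33.0).scale)
    print(pick_config(budget_ms=33.0, contention=2.0).scale)
```

Note that this rule never looks at the scene itself (intrinsic context) and that its budget is a hand-set constant. These are the two limitations that the abstract says a learned policy is meant to remove.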